THE GENE CLUSTER INVOLVED IN SAFRACIN BIOSYNTHESIS
AND ITS USES FOR GENETIC ENGINEERING 

FIELD OF THE INVENTION 



The present invention relates to the gene cluster responsible for the 
biosynthesis of safracin, its uses for genetic engineering and new safracins 
obtained by manipulation of the biosynthesis mechanism.



BACKGROUND OF THE INVENTION 



Safracins, a family of new compounds with potent broad-spectrum
antibacterial activity, were discovered in a culture broth of Pseudomonas
sp. Safracin occurs in two Pseudomonas sp. strains, Pseudomonas
fluorescens A2-2 isolated from a soil sample collected in Tagawa-gun,
Fukuoka, Japan (Ikeda et al. J. Antibiotics 1983, 36, 1279-1283; WO 82
00146 and JP 58113192) and Pseudomonas fluorescens SO 12695 isolated
from water samples taken from the Raritan-Delaware Canal, near New
Jersey (Meyers et al. J. Antibiot. 1983, 36(2), 190-193). Safracins A and B,
produced by Pseudomonas fluorescens A2-2, have been examined against
different tumor cell lines and have been found to possess antitumor activity
in addition to antibacterial activity.




[Structures: safracin A (R = H), safracin B (R = OH) and ET-743]



Due to the structural similarities between safracin B and ET-743, safracin
offers the possibility of hemi-synthesis of the highly promising potent new
antitumor agent ET-743, which was isolated from the marine tunicate
Ecteinascidia turbinata and is currently in Phase II clinical trials in
Europe and the United States. A hemisynthesis of ET-743 has been achieved
starting from safracin B (Cuevas et al. Organic Lett. 2000, 10, 2545-2548;
WO 00 69862 and WO 01 87895).

As an alternative to making safracins or their structural analogs by
chemical synthesis, manipulating the genes governing secondary
metabolism offers a promising route and allows for preparation of
these compounds biosynthetically. Additionally, the safracin structure
offers exciting possibilities for combinatorial biosynthesis.

In view of the complex structure of the safracins and the limitations 
in their obtention from Pseudomonas fluorescens A2-2, it would be highly 
desirable to understand the genetic basis of their synthesis in order to 
create the means to influence them in a targeted manner. This could 
increase the amounts of safracins being produced, because natural 
production strains generally yield only low concentrations of the secondary
metabolites that are of interest. It could also allow the production of
safracins in hosts that otherwise do not produce these compounds.
Additionally, the genetic manipulation could be used for combinatorial
creation of novel safracin analogs that could exhibit improved properties
and that could be used in the hemi-synthesis of new ecteinascidin
compounds.



However, the success of a biosynthetic approach depends critically 
on the availability of novel genetic systems and on genes encoding novel 
enzyme activities. Elucidation of the safracin gene cluster contributes to 
the general field of combinatorial biosynthesis by expanding the repertoire 
of genes uniquely associated with safracin biosynthesis, leading to the
possibility of making novel precursors and safracins via combinatorial 
biosynthesis. 



SUMMARY OF THE INVENTION



We have now been able to identify and clone the genes of safracin 
biosynthesis, providing the genetic basis for improving and manipulating 
in a targeted manner the productivity of Pseudomonas sp., and using 
genetic methods, for synthesising safracin analogues. Additionally, these 
genes encode enzymes that are involved in biosynthetic processes to
produce structures, such as safracin precursors, that can form the basis
of combinatorial chemistry to produce a wide variety of compounds.
These compounds can be screened for a variety of bioactivities including
anticancer activity. 




PCT/GB2003/005563 



Therefore in a first aspect the present invention provides a nucleic acid, 
suitably an isolated nucleic acid, which includes a DNA sequence
(including mutations or variants thereof), that encodes non-ribosomal
peptide synthetases which are responsible for the biosynthesis of
safracins. This invention provides a gene cluster, suitably an isolated
gene cluster, with open reading frames encoding polypeptides to direct the
assembly of a safracin molecule. 



One aspect of the present invention is a composition including at 
least one nucleic acid sequence, suitably an isolated nucleic acid molecule,
that encodes at least one polypeptide that catalyses at least one step of the 
biosynthesis of safracins. Two or more such nucleic acid sequences can 
be present in the composition. DNA or corresponding RNA is also 
provided. 

In particular the present invention is directed to a nucleic acid 
sequence, suitably an isolated nucleic acid sequence, from a safracin gene 
cluster comprising said nucleic acid sequence, a portion or portions of said 
nucleic acid sequence wherein said portion or portions encode a 
polypeptide or polypeptides or a biologically active fragment of a
polypeptide or polypeptides, a single-stranded nucleic acid sequence 
derived from said nucleic acid sequence, or a single stranded nucleic acid 
sequence derived from a portion or portions of said nucleic acid sequence, 
or a double-stranded nucleic acid sequence derived from the single- 
stranded nucleic acid sequence (such as cDNA from mRNA). The nucleic 
acid sequence can be DNA or RNA. 

More particularly, the present invention is directed to a nucleic acid 
sequence, suitably an isolated nucleic acid sequence, which includes or 
comprises at least SEQ ID 1, variants or portions thereof, or at least one of 
the sacA, sacB, sacC, sacD, sacE, sacF, sacG, sacH, sacI, sacJ, orf1,
orf2, orf3 or orf4 genes, including variants or portions. Portions can
be at least 10, 15, 20, 25, 50, 100, 1000, 2500, 5000, 10000, 20000, 
25000 or more nucleotides in length. Typically the portions are in the 
range 100 to 5000, or 100 to 2500 nucleotides in length, and are 
biologically functional. 

Mutants or variants include polynucleotide molecules in which at
least one nucleotide residue is altered, substituted, deleted or inserted.
Multiple changes are possible, with a different nucleotide at 1, 2, 3, 4, 5,
10, 15, 25, 50, 100, 200, 500 or more positions. Degenerate variants are
envisaged which encode the same polypeptide, as well as non-degenerate
variants which encode a different polypeptide. The portion, mutant or
variant nucleic acid sequence suitably encodes a polypeptide which retains 
a biological activity of the respective polypeptide encoded by any of the 
open reading frames of the safracin gene cluster. Allelic forms and 
polymorphisms are embraced. 
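
The distinction between degenerate and non-degenerate variants can be illustrated with a minimal Python sketch. It assumes the standard genetic code; the six-nucleotide sequences are hypothetical examples chosen for illustration and are not taken from SEQ ID 1.

```python
# Standard genetic code in compact form: codons are enumerated with the
# bases in the order T, C, A, G at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence whose length is a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(reference: str, variant: str) -> bool:
    """True if the variant differs in nucleotides but encodes the same peptide."""
    different = reference.upper() != variant.upper()
    return different and translate(reference) == translate(variant)


# Hypothetical sequences (illustration only, not from SEQ ID 1).
reference = "GCTGGT"  # encodes Ala-Gly
silent = "GCCGGA"     # degenerate variant: different codons, same dipeptide
changed = "GCTGAT"    # non-degenerate variant: encodes Ala-Asp
```

Here `is_degenerate_variant(reference, silent)` holds, whereas `changed` encodes a different polypeptide and would fall under the non-degenerate variants.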

The invention is also directed to an isolated nucleic acid sequence 
capable of hybridizing under stringent conditions with a nucleic acid
sequence of this invention. Particularly preferred is hybridisation with a
translatable length of a nucleic acid sequence of this invention. 

The invention is also directed to a nucleic acid encoding a 
polypeptide which is at least 30%, preferably 50%, preferably 60%, more 
preferably 70%, in particular 80%, 90%, 95% or more identical in amino 
acid sequence to a polypeptide encoded by any of the safracin gene cluster 
open reading frames sacA to sacJ and orf1 to orf4 (SEQ ID 1 and genes
encoded in SEQ ID 1) or encoded by a variant or portion thereof. The 
polypeptide suitably retains a biological activity of the respective 
polypeptide encoded by any of the safracin gene cluster open reading
frames.

WO 2004/056998
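
The amino acid identity thresholds stated above (30% up to 95% or more) can be checked with a short Python sketch. It assumes the two sequences have already been aligned by some external method, and it assumes one common definition of identity (identical columns divided by aligned columns); the text does not specify which definition is intended. The example fragments are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical columns / aligned columns, ignoring
    columns that are gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 30.0) -> bool:
    """True if the identity reaches the given percentage threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments (not taken from SEQ ID 2-15).
example = percent_identity("MKTAY-LL", "MKSAYQLL")
```

Other conventions (for instance dividing by the length of the shorter sequence) give different percentages, so the denominator should be fixed before comparing against a threshold.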






In particular, the invention is directed to an isolated nucleic acid
sequence encoding any of the SacA, SacB, SacC, SacD, SacE, SacF, SacG,
SacH, SacI, SacJ, Orf1, Orf2, Orf3 or Orf4 proteins (SEQ ID 2-15), and
variants, mutants or portions thereof. 

In one aspect, an isolated nucleic acid sequence of this invention 
encodes a peptide synthetase, an L-Tyr derivative hydroxylase, an L-Tyr
derivative methylase, an L-Tyr O-methylase, a methyl-transferase, a
monooxygenase or a safracin resistance protein.

The invention also provides a hybridization probe which is a nucleic 
acid sequence as defined above or a portion thereof. Probes suitably
comprise a sequence of at least 5, 10, 15, 20, 25, 30, 40, 50, 60, or more
nucleotide residues. Sequences with a length in the range 25 to 60 are
preferred. The invention is also directed to the use of a probe as defined
for the detection of a safracin or ecteinascidin gene. In particular, the
probe is used for the detection of genes in Ecteinascidia turbinata.

In a related aspect the invention is directed to a polypeptide encoded 
by a nucleic acid sequence as defined above. Full sequence, variant, 
mutant or fragment polypeptides are envisaged. 

In a further aspect the invention is directed to a vector, preferably an 
expression vector, preferably a cosmid, comprising a nucleic acid sequence 
encoding a protein or biologically active fragment of a protein, wherein said 
nucleic acid is as defined above. 

In another aspect the invention is directed to a host cell transformed 
with one or more of the nucleic acid sequences as defined above, or a 
vector, an expression vector or cosmid as defined above. A preferred host
cell is transformed with an exogenous nucleic acid comprising a gene
cluster encoding polypeptides sufficient to direct the assembly of a safracin
or safracin analog. Preferably the host cell is a microorganism, more
preferably a bacterium.

The invention is also directed to a recombinant bacterial host cell in 
which at least a portion of a nucleic acid sequence as defined above is 
disrupted to result in a recombinant host cell that produces altered levels
of safracin compound or safracin analogue, relative to a corresponding
nonrecombinant bacterial host cell.

The invention is also directed to a method of producing a safracin 
compound or safracin analogue comprising fermenting, under conditions
and in a medium suitable for producing such a compound or analogue, an
organism such as Pseudomonas sp. in which the copy number of the
safracin genes/cluster encoding polypeptides sufficient to direct the
assembly of a safracin or safracin analog has been increased.

The invention is also directed to a method of producing a safracin 
compound or analogue comprising fermenting, under conditions and in a 
medium suitable for producing such compound or analogue, an organism 
such as Pseudomonas sp in which expression of the genes encoding 
polypeptides sufficient to direct the assembly of a safracin or safracin
analogue has been modulated by manipulation or replacement of one or 
more genes or sequence responsible for regulating such expression. 
Preferably expression of the genes is enhanced. 

The invention is also directed to the use of a composition including
at least one isolated nucleic acid sequence as defined above or a
modification thereof for the combinatorial biosynthesis of non-ribosomal
peptides, diketopiperazine rings and safracins.

In particular the method involves contacting a compound that is a
substrate for a polypeptide encoded by one or more of the safracin
biosynthesis gene cluster open reading frames as defined above with the
polypeptide encoded by one or more safracin biosynthesis gene cluster
open reading frames, whereby the polypeptide chemically modifies the
compound. 

In still another embodiment, this invention provides a method of 
producing a safracin or safracin analog. The method involves providing a
microorganism transformed with an exogenous nucleic acid comprising a
safracin gene cluster encoding polypeptides sufficient to direct the
assembly of said safracin or safracin analog; culturing the bacteria under
conditions permitting the biosynthesis of safracin or safracin analog; and 
isolating said safracin or safracin analog from said cell. 

The invention is also directed to any of the precursor compounds P2,
P14, analogs and derivatives thereof and their use in the combinatorial
biosynthesis of non-ribosomal peptides, diketopiperazine rings and
safracins.

Additionally, the invention is also directed to the new safracins
obtained by knock out, namely safracin P19B, safracin P22A, safracin P22B,
safracin D and safracin E, and their use as antimicrobial or antitumor
agents, as well as their use in the synthesis of ecteinascidin compounds.

The invention is also directed to new safracins obtained by directed 
biosynthesis as defined above, and their use as antimicrobial or antitumor 
agents, as well as their use in the synthesis of ecteinascidin compounds.
In particular the invention is directed to safracin B-ethoxy and
safracin A-ethoxy and their use.






In one aspect, the present invention enables the preparation of
structures related to safracins and ecteinascidins which cannot be prepared,
or are difficult to prepare, by chemical synthesis. Another aspect is to use
the knowledge to gain access to the biosynthesis of ecteinascidins in
Ecteinascidia turbinata, for example using these sequences or parts as
probes in this organism or a putative symbiont.

More fundamentally, the invention opens a broad field and gives 
access to ecteinascidins by genetic engineering. 

BRIEF DESCRIPTION OF THE FIGURES 

Fig. 1: Structural organization of the chromosomal DNA region cloned in
pL30P cosmid. The region of P. fluorescens A2-2 DNA containing the
safracin gene cluster is shown. Both gene operons, sacABCDEFGH and sacIJ,
and the modular organization of the peptide synthetases deduced
from sacA, sacB and sacC are illustrated. The following domains are
indicated: C: condensation; T: thiolation; A: adenylation and Re: reductase.
The location of other genes present in pL30P cosmid (orf1 to orf4) as well as
their proposed function is shown.

Fig. 2: Conserved core motifs between NRPSs. Conserved amino acid
sequences in SacA, SacB and SacC proteins and their comparison with their
homologous sequences from Myxococcus xanthus DM50415.

Fig. 3: NRPS biosynthesis mechanism proposed for the formation of the
Ala-Gly dipeptide. Step a*, adenylation of Ala; b*, transfer to the
4'-phosphopantetheinyl arm; c*, transfer to the waiting/elongation site; d*,
adenylation of the Gly; e*, transfer to the 4'-phosphopantetheinyl arm; f*,
condensation of the elongation chain on the 4'-phosphopantetheinyl arm
with the starter chain at the waiting/elongation site; g*, Ala-Gly dipeptide
attached to the phosphopantetheinyl arm of SacA and h*, transfer of the
elongated chain to the following waiting/elongation site.

Fig. 4: Cross-feeding experiments. A. Scheme of A2-2 DNA fragments
cloned in pBBR1-MCS2 vector and products obtained in the heterologous
host. B. HPLC profile of safracin production in wild type strain versus sacF
mutant. The addition of P2 precursor to the sacF mutant, provided both in
trans and synthetically, yields safracin B production. SfcA, safracin A and
SfcB, safracin B.

Fig. 5: Scheme of the safracin biosynthesis mechanism and biosynthetic 
intermediates. Single enzymatic steps are indicated by a continuous arrow
and multiple reaction steps are indicated by discontinuous arrows.

Fig. 6: Safracin gene disruptions and compounds produced. A. Gene
disruption and precursor molecules synthesized by the mutants
constructed. The gene marked with an asterisk does not belong to the
safracin cluster. Inactivation of genes orf1, orf2, orf3 and orf4 has been
shown to have no effect on safracin production. B. HPLC profile of safracin
production in wild type strain and in sacA, sacI and sacJ mutants.
The structure of the different molecules obtained is shown.

Fig. 7: Structure of the different molecules obtained by gene disruption.
Inactivation of SacJ protein (a) yields P22B, P22A and P19 molecules,
whereas gene disruption of sacI (b) produces only the P19 compound. The
sacI disruption, together with the sacJ reconstructed expression, produces
two new safracins: safracin D (possible precursor for ET-729
hemi-synthesis) and safracin E (c).

Fig. 8: Addition of specific designed 'unnatural' precursors (P3). Chemical
structure of the two molecules obtained by addition of P3 compound to the
sacF mutant.

Fig. 9: Scheme of the gene disruption event through simple recombination, 
using a homologous DNA fragment cloned into pK18:MOB (an integrative
plasmid in Pseudomonas). 



DETAILED DESCRIPTION OF THE INVENTION

Non-ribosomal peptide synthetases (NRPS) are enzymes responsible for the
biosynthesis of a family of compounds that include a large number of
structurally and functionally diverse natural products. For example,
peptides with biological activities provide the structural backbone for
compounds that exhibit a variety of biological activities, such as
antibiotic, antiviral, antitumor and immunosuppressive agents (Zuber et
al. Biotechnology of Antibiotics 1997 (W. Strohl, ed.), 187-216, Marcel
Dekker, Inc., N.Y.; Marahiel et al. Chem. Rev. 1997, 97, 2651-2673).

Although structurally diverse, most of these biologically active peptides
share a common mechanistic scheme of biosynthesis. According to this
model, peptide bond formation takes place on multienzymes designated
peptide synthetases, on which amino acid substrates are activated by ATP
hydrolysis to the corresponding adenylate. This unstable intermediate is
subsequently transferred to another site of the multienzyme where it is
bound as a thioester to the cysteamine group of an enzyme-bound
4'-phosphopantetheinyl (4'-PP) cofactor. At this stage, the thiol-activated
substrates can undergo modifications such as epimerisation or
N-methylation. Thioesterified substrate amino acids are then integrated into
the peptide product through a step-by-step elongation by a series of
transpeptidation reactions. With this template arrangement in peptide
synthetases, the modules seem to operate independently of one another,
but they act in concert to catalyse the formation of successive peptide
bonds (Stachelhaus et al. Science 1995, 269, 69-72; Stachelhaus et al.
Chem. Biol. 1996, 3, 913-921). The general scheme for non-ribosomal
peptide biosynthesis has been widely reviewed (Marahiel et al. Chem. Rev.
1997, 97, 2651-2673; Konz and Marahiel, Chem. and Biol. 1999, 6, R39-
R48; Moffit and Neilan, FEMS Microbiol. Letters 2000, 191, 159-167).

A large number of bacterial operons and fungal genes encoding 
peptide synthetases have recently been cloned, sequenced and partially 
characterized, providing valuable insights into their molecular architecture
(Marahiel, Chem and Biol 1997, 4, 561-567). Different cloning strategies 
were used, including probing of expression libraries by antibodies raised 
against peptide synthetases, complementation of deficient mutants, and 
the use of designed oligonucleotides derived from amino acid sequences of 
peptide synthetase fragments. 

Analysis of the primary structure of these genes revealed the
presence of distinct homologous domains of about 600 amino acids. These
specific functional domains consist of at least six highly conserved core
sequences of about three to eight amino acids in length, whose order and
location within all known domains are very similar (Turgay and Marahiel,
Peptide Research 1994, 7, 238-241). The use of degenerate
oligonucleotides derived from the conserved cores opens the possibility of
identifying and cloning peptide synthetases from genomic DNA by using
the polymerase chain reaction (PCR) technology (Turgay and Marahiel,
Peptide Research 1994, 7, 238-241; Borchert et al. FEMS Microbiol. Letters
1992, 92, 175-180).



The structure of safracin suggests that this compound is synthesized
by a NRPS mechanism. The cloning and expression of the non-ribosomal
peptide synthetases and the associated tailoring enzymes from the
Pseudomonas fluorescens A2-2 safracin cluster would allow production of
unlimited amounts of safracin. In addition, the cloned genes could be used
for combinatorial creation of novel safracin analogs that could exhibit
improved properties and that could be used in the hemi-synthesis of new 
ecteinascidins. Moreover, cloning and expressing the safracin gene cluster 
in heterologous systems or the combination of safracin gene cluster with 
other NRPS genes could result in the creation of novel drugs with improved 
activities. 



The present invention provides, in particular, the DNA sequence 
encoding NRPS responsible for biosynthesis of safracin, i.e., safracin 
synthetases. We have characterized a 26,705 bp region (SEQ ID NO:1)
from the Pseudomonas fluorescens A2-2 genome, cloned in pL30P cosmid, and
demonstrated, by knockout experiments and heterologous expression, that
this region is responsible for safracin biosynthesis. We expressed the
pL30P cosmid in two strains of Pseudomonas sp. which do not produce
safracin, and the result was production of safracin A and B at levels of
22% for P. fluorescens (CECT 378) and 2% for P. aeruginosa (CECT 110),
in comparison with P. fluorescens A2-2 production. The predicted amino
acid sequences of the various peptides encoded by this DNA sequence are
shown in SEQ ID NO:2 through SEQ ID NO:15 respectively.

The gene cluster for safracin biosynthesis derived from P. fluorescens
A2-2 is characterized by the presence of several open reading frames
(ORF) that are organized in two divergent operons (Fig. 1), an eight-gene
operon (sacABCDEFGH) and a two-gene operon (sacIJ), preceded by
well-conserved putative promoter regions that overlap. The safracin
biosynthesis gene cluster is present in only one copy in the P. fluorescens
A2-2 genome.

Our results indicate that the eight-gene operon would be
responsible for the safracin skeleton biosynthesis and the two-gene
operon would be responsible for the final tailoring of safracins.

In the sacABCDEFGH operon, the deduced amino acid sequences 
encoded by sacA, sacB and sacC strongly resemble gene products of 
NRPSs. Within the deduced amino acid sequences of SacA, SacB and 
SacC, one peptide synthetase module was identified on each of the ORFs. 

The first surprising feature of the safracin NRPS proteins is that, of
the known active sites and core regions of peptide synthetases (Konz and
Marahiel, Chem. and Biol. 1999, 6, R39-R48), the first core is poorly
conserved in all three peptide synthetases, SacA, SacB and SacC (Fig. 2).
The other five core regions are well conserved in the three safracin NRPS
genes. The biological significance of the first core (LKAGA) is unknown, but
the SGT(ST)TGxPKG (Gocht and Marahiel, J. Bacteriol. 1994, 176, 2654-
2662; Konz and Marahiel, Chem. and Biol. 1999, 6, R39-R48), the TGD
(Gocht and Marahiel, J. Bacteriol. 1994, 176, 2654-2662; Konz and
Marahiel, 1999) and the KIRGxRIEL (Pavela-Vrancic et al. J. Biol. Chem.
1994, 269, 14962-14966; Konz and Marahiel, Chem. and Biol. 1999, 6,
R39-R48) core sequences could be assigned to ATP binding and hydrolysis.
The serine residue of the core sequence LGGxS could be shown to be the
site of thioester formation (D'Souza et al., J. Bacteriol. 1993, 175, 3502-
3510; Vollenbroich et al., FEBS Lett. 1993, 325(3), 220-4; Konz and
Marahiel, Chem. and Biol. 1999, 6, R39-R48) and 4'-phosphopantetheine
binding (Stein et al. FEBS Lett. 1994, 340, 39-44; Konz and Marahiel,
Chem. and Biol. 1999, 6, R39-R48). These findings, together with the fact
that safracin seems to be synthesized from amino acids, support the
hypothesis that non-ribosomal peptide bond formation via the thiotemplate
mechanism is involved in the biosynthetic pathway of safracin and that
sacA, sacB and sacC encode the corresponding peptide synthetases.
According to this mechanism, amino acids are activated as aminoacyl-
adenylates by ATP hydrolysis and subsequently covalently bound to the
enzyme via carboxyl-thioester linkages. Then, in further steps,
transpeptidation and peptide bond formation occur.
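
Locating the core sequences named above in a deduced protein sequence can be sketched in Python as follows. This is only an illustration: it treats "x" as any residue, and it assumes that "(ST)" means "S or T", which is an interpretation of the notation rather than something the text states. The toy sequence is hypothetical and is not SacA, SacB or SacC.

```python
import re

# Core motifs as written in the text. "x" stands for any residue; reading
# "(ST)" as "S or T" is an assumption about the notation.
CORE_MOTIFS = ["SGT(ST)TGxPKG", "TGD", "KIRGxRIEL", "LGGxS"]


def motif_to_regex(motif: str) -> re.Pattern:
    """Convert the motif notation into a compiled regular expression."""
    pattern = re.sub(r"\(([A-Z]+)\)", r"[\1]", motif).replace("x", ".")
    return re.compile(pattern)


def scan_core_motifs(protein: str) -> dict:
    """Return the 0-based start positions of each core motif in a protein."""
    protein = protein.upper()
    return {
        motif: [hit.start() for hit in motif_to_regex(motif).finditer(protein)]
        for motif in CORE_MOTIFS
    }


# Hypothetical toy sequence containing two of the motifs.
toy = "AAA" + "SGTSTGKPKG" + "AAA" + "LGGHS" + "AAA"
hits = scan_core_motifs(toy)
```

Exact pattern matching of this kind will miss poorly conserved cores such as the first one, for which an alignment-based comparison like that of Fig. 2 is more appropriate.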



Secondly, it is striking that our sequence data clearly show that the
colinearity rule, according to which the order of the amino acid binding
modules along the chromosome parallels the order of the amino acids in
the peptide, does not hold for the safracin synthetase system. According to
the sequence database homologies and the safracin and saframycin structure
homologies, SacA would be responsible for the recognition and activation
of the Gly residue and SacB and SacC would be responsible for the
recognition and activation of the two L-Tyr derivatives that are
incorporated into the safracin skeleton, while the putative Ala-NRPS gene
would be missing in the safracin gene cluster. In a few nonribosomal
peptide synthetase gene clusters, such as in the pristinamycin (Crecy-
Lagard et al., J. of Bacteriol. 1997, 179(3), 705-713) and in the
phosphinothricin tripeptide (Schwartz et al. Appl. Environ. Microbiol. 1996,
62, 570-577) biosynthesis pathways, the first NRPS is not juxtaposed with
the second NRPS gene. Specifically, in the pristinamycin biosynthetic
pathway the first structural gene (snbA) and the second structural gene
(snbC) are 130 kb apart. This is not the case for the safracin gene cluster,
where the results of the heterologous expression with the pL30P cosmid
clearly demonstrate that there is no NRPS gene missing, since there is
heterologous safracin production.

Thirdly, even though the question about the mechanism by which the
dipeptide Ala-Gly is formed remains open, the presence in sacA of an extra
C domain at the amino terminus of the first NRPS gene suggests the
possibility of a bifunctional adenylation-activation activity by this gene.
We propose that the Ala would be first charged on the phosphopantetheinyl
arm of SacA (Fig. 3, a* and b*) before being transferred to a waiting
position, a condensation domain, located at the N-terminus of SacA (Fig. 3,
c*). The Gly adenylate would then be charged on the same
phosphopantetheinyl arm (Fig. 3, d* and e*), positioned at the elongation
site, and elongation would occur (Fig. 3, f*). The arm of the first module
would at this stage be charged with an Ala-Gly dipeptide (Fig. 3, g*). We
propose that the dipeptide would then be transferred to a waiting
position on the second phosphopantetheinyl arm (Fig. 3, h*), located in
SacB, to continue the synthesis of the safracin tetrapeptide basic skeleton.
An alternative biosynthesis mechanism could be the direct incorporation of
a dipeptide Ala-Gly into SacA. In this case, the dipeptide could
originate from the activity of the highly active peptidyl transferase
ribozyme family (Sun et al., Chem. and Biol. 2002, 9, 619-626) or from the
activity of bacterial proteolysis.



Fourthly, although in most of the prokaryotic peptide synthetases
the thioesterase moiety, which appears to be responsible for the release of
the mature peptide chain from the enzyme, is fused to the C-terminal end
of the last amino acid binding module (Marahiel et al. Chem. Rev. 1997,
97, 2651-2673), in the case of safracin synthetases the TE domain is
missing. Probably, in the safracin synthesis, after the last elongation step
the tetrapeptide could be released by an alternative strategy for peptide-
chain termination that also occurs in the saframycin synthesis (Pospiech
et al. Microbiol. 1996, 142, 741-746). This particular termination strategy
relies on a reductase domain at the carboxy-terminal end of the
SacC peptide synthetase, which catalyses the reductive cleavage of the
associated T-domain-tethered acyl group, releasing a linear aldehyde.



Our cross-feeding experiments indicate that the last two amino acids
incorporated into the safracin molecule are two L-Tyr derivatives called P2
(3-hydroxy-5-methyl-O-methyltyrosine) (Fig. 4, 5), instead of two L-Tyr as
is proposed to occur in saframycin synthesis. First, the products of two
genes (sacF and sacG), similar to bacterial methyltransferases, have been
shown to be involved in the O- and C-methylation of L-Tyr to produce P14
(3-methyl-O-methyltyrosine), the precursor of P2. A possible mechanism could
envisage that the O-methylation occurs first and then the C-methylation of
the amino acid derivative takes place. Secondly, P2, the substrate for the
peptide synthetases SacB and SacC, is formed by the hydroxylation of P14
by SacD (Fig. 4, 5).




[Structures: P14 and P2]

                    P14                    P2
Formula             C11H15NO3              C11H15NO4
Exact mass          209.11                 225.10
Mol. wt             209.24                 225.24
Elemental analysis  C, 63.14; H, 7.23;     C, 58.66; H, 6.71;
                    N, 6.69; O, 22.94      N, 6.22; O, 28.41
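
The molecular weight, exact mass and elemental composition listed for P14 and P2 follow directly from their molecular formulas. The Python sketch below recomputes them; the atomic masses are standard rounded values supplied here as an assumption, since they are not given in the text.

```python
# Standard average atomic weights and monoisotopic masses (rounded values,
# supplied for this illustration).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def molecular_weight(formula: dict) -> float:
    """Average molecular weight from element counts."""
    return sum(AVERAGE_MASS[el] * n for el, n in formula.items())


def exact_mass(formula: dict) -> float:
    """Monoisotopic (exact) mass from element counts."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())


def percent_composition(formula: dict) -> dict:
    """Mass percentage of each element in the compound."""
    total = molecular_weight(formula)
    return {el: 100.0 * AVERAGE_MASS[el] * n / total for el, n in formula.items()}


P14 = {"C": 11, "H": 15, "N": 1, "O": 3}  # C11H15NO3
P2 = {"C": 11, "H": 15, "N": 1, "O": 4}   # C11H15NO4

for name, formula in (("P14", P14), ("P2", P2)):
    composition = percent_composition(formula)
    print(name, round(molecular_weight(formula), 2), round(exact_mass(formula), 2),
          {el: round(pct, 2) for el, pct in composition.items()})
```

The two formulas differ by a single oxygen atom, consistent with P2 being formed from P14 by hydroxylation.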



Apart from the safracin biosynthetic genes, the sacABCDEFGH
operon also contains two genes, sacE and sacH, involved in an
unknown function and in the safracin resistance mechanism, respectively.
We have demonstrated that the sacH gene codes for a protein that, when
heterologously expressed in different Pseudomonas strains, produces a large
increase in safracin B resistance. SacH is a putative
transmembrane protein that transforms the C21-OH group of safracin B
into a C21-H group to produce safracin A, a compound with less antibiotic
and antitumoral activity. Finally, even though the putative function of
SacE is still unknown, homologues of this gene have been found close
to various secondary metabolite biosynthetic gene clusters in some
microorganism genomes, suggesting a conserved function of these genes in
secondary metabolite formation or regulation.

In the sacIJ operon, the deduced amino acid sequences encoded by sacI
and sacJ strongly resemble gene products of methyltransferase and
hydroxylase/monooxygenase, respectively. Our data reveal that SacI is the
enzyme responsible for the N-methylation present in the safracin
structure, and that SacJ is the protein which makes an additional
hydroxylation on one of the L-Tyr derivatives incorporated into the
tetrapeptide to produce the quinone structure present in all safracin
molecules. N-Methylation is one of the modifications of nonribosomally
synthesized peptides that significantly contributes to their biological
activity. Except for saframycin (Pospiech et al. Microbiol. 1996, 142, 741-
746), which is produced by bacteria and is N-methylated, all the known
N-methylated nonribosomal peptides are produced by fungi or
actinomycetes and, in most cases, the N-methylation is carried out by
a domain which resides in the nonribosomal peptide synthetase.



Table I. Summary of safracin biosynthetic and resistance genes identified 
in cosmid pL30P. 

ORF    Protein  Proposed function          Position        Amino  Molecular 
name   name                                start-stop bp   acids  weight (kDa) 

sacA   SacA     Peptide synthetase         3052-6063       1004   110.4 
sacB   SacB     Peptide synthetase         6068-9268       1063   117.5 
sacC   SacC     Peptide synthetase         9275-13570      1432   157.3 
sacD   SacD     L-Tyr derivative           13602-14651     350    39.2 
                hydroxylase 
sacE   SacE     Unknown                    14719-14901     61     6.7 
sacF   SacF     L-Tyr derivative           14962-16026     355    39.8 
                methylase 
sacG   SacG     L-Tyr O-methylase          16115-17155     347    38.3 
sacH   SacH     Resistance protein         17244-17783     180    19.6 
sacI   SacI     Methyltransferase          2513-1854       220    24.2 
sacJ   SacJ     Monooxygenase              1861-355        509    55.3 
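
The coordinates and amino acid counts of Table I can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that positions are 1-based and inclusive, that the printed span covers exactly three nucleotides per listed amino acid, and that a start larger than the stop marks an ORF on the reverse strand. The names TABLE_I, orf_summary and check_table are hypothetical helpers, not part of the disclosure.

```python
# Rows of Table I as printed: (ORF, start, stop, amino acids).
TABLE_I = [
    ("sacA", 3052, 6063, 1004),
    ("sacB", 6068, 9268, 1063),
    ("sacC", 9275, 13570, 1432),
    ("sacD", 13602, 14651, 350),
    ("sacE", 14719, 14901, 61),
    ("sacF", 14962, 16026, 355),
    ("sacG", 16115, 17155, 347),
    ("sacH", 17244, 17783, 180),
    ("sacI", 2513, 1854, 220),
    ("sacJ", 1861, 355, 509),
]


def orf_summary(start, stop):
    """Return (strand, length in bp) for 1-based inclusive coordinates."""
    strand = "+" if start < stop else "-"
    length_bp = abs(stop - start) + 1
    return strand, length_bp


def check_table(rows):
    """Map each ORF to (strand, length in bp, length == 3 x amino acids)."""
    report = {}
    for name, start, stop, amino_acids in rows:
        strand, length_bp = orf_summary(start, stop)
        report[name] = (strand, length_bp, length_bp == 3 * amino_acids)
    return report


if __name__ == "__main__":
    for name, (strand, length_bp, consistent) in check_table(TABLE_I).items():
        flag = "ok" if consistent else "check coordinates"
        print(f"{name}  strand {strand}  {length_bp} bp  {flag}")
```

Under these assumptions the rows for sacB and sacJ are flagged, whereas the coordinates quoted in the sequence analysis of Example 3 (6080 to 9268 for sacB and 1861 to 335 for sacJ) satisfy the same check. The strand column also reflects that sacI and sacJ read in the opposite sense to the sacA to sacH genes.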



The putative safracin synthetic pathway, with indications of the specific 
amino acid substrates used for each condensation reaction and the 
various post-condensation activities, is shown in Fig. 5. 

To further evaluate the role of the safracin biosynthetic genes, we 
constructed knock out mutants of each of the genes of the safracin cluster 
(Fig. 6). The disruption of the NRPS genes (sacA, sacB and sacC), as well 
as of sacD, sacF and sacG, resulted in mutants producing neither safracin 
nor P2. Our results indicate that the genes from sacA to sacH are part of 
the same genetic operon. As a consequence of the sacI and sacJ gene 
disruptions, three new molecules were produced: P19B, P22A and P22B 
(Fig. 6). 




[Chemical structures of P19B, P22A and P22B] 

The production of P22A and P22B (Fig. 7a*) by the sacJ mutant 
demonstrated that the role of the SacJ protein is to perform the additional 
hydroxylation of the left L-Tyr derivative amino acid of safracin, the 
one involved in the quinone ring. The production of P19B (Fig. 7b*) by the 
sacI mutant, a safracin-like molecule where the N-methylation and the 
quinone ring are missing, confirms that SacI is the N-methyltransferase 
enzyme and suggests that sacIJ is a transcriptional operon. The production 
of P19B also by the sacJ mutant (Fig. 7a*) suggests that the 
N-methylation probably occurs after the quinone ring has been formed. Even 
though these new structures have no interesting antimicrobial activity on 
B. subtilis and no high cytotoxic activity on cancer cells, they can serve 
as interesting new precursors for the hemisynthesis of new active 
molecules. As far as structure-activity is concerned, the observation that 
P19B, P22A and P22B appear to lose their activity suggests that the loss 
of the quinone ring from the safracin structure is directly related to the 
loss of activity of the safracin family molecules. 

The disruption of the sacI gene with the reconstitution of sacJ gene 
expression resulted in the production of two new safracins. The two 
antibiotics produced, at levels of production as high as the levels of 
safracin A/safracin B production in the wild type strain, have been named 
safracin D and safracin E (Fig. 7c*). 




[Chemical structures of safracin D and safracin E] 



Safracin D and safracin E are safracin B and safracin A like 
molecules, respectively, where the N-methylation is missing. Both safracin 
D and safracin E have been shown to possess the same antibacterial and 
antitumoral activities as safracin B and safracin A, respectively. Apart 
from its high antibacterial and antitumoral activities, safracin 
D could be used in the hemi-synthesis of the ecteinascidin ET-729, a 
potent antitumoral agent, as well as in the hemi-synthesis of new 
ecteinascidins. 

A question arises concerning the role of the aminopeptidase-like protein 
coded by a gene located at the 3' side of the safracin operon. The 
insertional inactivation of orf1 (PM-S1-014) showed no effect on safracin 
A/safracin B production. Given its predicted function, it remains unclear 
whether this protein could play some role in safracin metabolism. The 
other genes present in the pL30P cosmid (orf2 to orf4) will have to be 
studied in more detail. 



Another aspect of the invention is that it provides the tools necessary for 
the production of new, specifically designed "unnatural" molecules. The 
addition of a specific modified P2 derivative precursor named P3, a 3- 
hydroxy-5-methyl-O-methyltyrosine, to the sacE mutant yields two 
"unnatural" safracins that incorporate this specific modified precursor, 
safracin A(OEt) and safracin B(OEt) (Fig. 8). 




[Chemical structures of safracin A(OEt) and safracin B(OEt)] 

safracin A(OEt) 
C30H40N4O6 
Exact Mass: 552.29 
Mol. Wt.: 552.66 
C, 65.20; H, 7.30; N, 10.14; O, 17.37 

safracin B(OEt) 
C30H40N4O7 
Exact Mass: 568.29 
Mol. Wt.: 568.66 
C, 63.36; H, 7.09; N, 9.85; O, 19.69 

The two new safracins are potent antibiotic and antitumoral 
compounds. The biological activities of safracin A(OEt) and safracin B(OEt) 
are as potent as the activities of safracin A and safracin B, respectively. 
These new safracins could be the source for new potent antitumoral 
agents, as well as a source of molecules for the hemi-synthesis of new 
ecteinascidins. 
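
The molecular weights and elemental compositions quoted for safracin A(OEt) and safracin B(OEt) follow directly from their molecular formulas. The sketch below shows the arithmetic; the average atomic weights in ATOMIC_WEIGHT are assumed values supplied for illustration (they are not given in this description), and the function names are hypothetical.

```python
import re

# Average atomic weights: assumed values, not taken from this description.
ATOMIC_WEIGHT = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994}


def parse_formula(formula):
    """Turn a simple formula such as 'C30H40N4O6' into {'C': 30, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula):
    """Sum of average atomic weights over all atoms in the formula."""
    counts = parse_formula(formula)
    return sum(ATOMIC_WEIGHT[element] * n for element, n in counts.items())


def composition(formula):
    """Mass percentage of each element, rounded to two decimals."""
    counts = parse_formula(formula)
    total = molecular_weight(formula)
    return {
        element: round(100 * ATOMIC_WEIGHT[element] * n / total, 2)
        for element, n in counts.items()
    }


if __name__ == "__main__":
    for name, formula in [("safracin A(OEt)", "C30H40N4O6"),
                          ("safracin B(OEt)", "C30H40N4O7")]:
        print(name, round(molecular_weight(formula), 2), composition(formula))
```

With these assumed atomic weights the computed values agree with the figures listed above to two decimal places.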



In addition, the genes involved in safracin synthesis could be combined 
with other nonribosomal peptide synthetase genes to create novel 
"unnatural" drugs and analogs with improved activities. 



EXAMPLES 

Example 1: Extraction of nucleic acid molecules from Pseudomonas 
fluorescens A2-2 

Bacterial strains 

Strains of Pseudomonas sp. were grown at 27°C in Luria-Bertani (LB) 
broth (Ausubel et al. 1995, J. Wiley and Sons, New York, N.Y.). E. coli 
strains were grown at 37°C in LB medium. Antibiotics were used at the 
following concentrations: ampicillin (50 µg/ml), tetracycline (20 µg/ml) and 
kanamycin (50 µg/ml). 



Table II. Strains used in this invention. 

Code         Genotype 

PM-S1-001    P. fluorescens A2-2 wild type 
PM-S1-002    sacA- 
PM-S1-003    sacB- 
PM-S1-004    sacC- 
PM-S1-005    sacJ- 
PM-S1-006    sacI- 
PM-S1-007    sacI- with sacJ expression reconstitution 
PM-S1-008    sacF- 
PM-S1-009    sacG- 
PM-S1-010    sacD- 
PM-S1-014    orf1- 
PM-S1-015    A2-2 + pLAFR3 
PM-S1-016    A2-2 + pL30p 
PM-19-001    P. fluorescens CECT378 + pLAFR3 
PM-19-002    P. fluorescens CECT378 + pL30p 
PM-19-003    P. fluorescens CECT378 + pBBR1-MCS2 
PM-19-004    P. fluorescens CECT378 + pB5H83 
PM-19-005    P. fluorescens CECT378 + pB7983 
PM-19-006    P. fluorescens CECT378 + pBHPT3 
PM-16-001    P. aeruginosa CECT110 + pLAFR3 
PM-16-002    P. aeruginosa CECT110 + pL30p 
PM-17-003    P. putida ATCC12633 + pBBR1-MCS2 
PM-17-004    P. putida ATCC12633 + pB5H83 
PM-17-005    P. putida ATCC12633 + pB7983 
PM-18-003    P. stutzeri ATCC17588 + pBBR1-MCS2 
PM-18-004    P. stutzeri ATCC17588 + pB5H83 
PM-18-005    P. stutzeri ATCC17588 + pB7983 



DNA manipulation 

Unless otherwise noted, standard molecular biology techniques for in 
vitro DNA manipulations and cloning were used (Sambrook et al. 1989, 
Cold Spring Harbor, NY: Cold Spring Harbor Laboratory). 

DNA extraction 

Total DNA from Pseudomonas fluorescens A2-2 cultures was 
prepared as reported (Sambrook et al. 1989, Cold Spring Harbor, NY: Cold 
Spring Harbor Laboratory). 

Computer analysis 

Sequence data were compiled and analysed using the DNA-Star software 
package. 

Example 2: Identification of NRPS genes responsible for safracin 
production in Pseudomonas fluorescens A2-2. 

Primer design 

Marahiel et al. (Chem. Rev. 1997, 97, 2651-2673) 
previously reported highly conserved core motifs of the catalytic domains 
of cyclic and branched peptide synthetases. Based on multiple sequence 
alignments of several reported peptide synthetases, the conserved regions 
A2, A3, A5, A6, A7 and A8 of adenylation and T of thiolation modules were 
targeted for the degenerate primer design (Turgay and Marahiel, Peptide 
Res. 1994, 7, 238-241). The wobble positions were designed with respect to 
codon preferences within the selected modules and the expected high G/C 
content of Pseudomonas sp. All oligonucleotides were obtained from 
ISOGEN (Bioscience BV). A PCR fragment was obtained when degenerate 
oligonucleotides derived from the YGPTE (A5 core) and LGGXS (T core) 
sequences were used. These oligonucleotides were denoted PS34-YG and 
PS6-FF, respectively. 

Table III. PCR primers designed for this study. 

Primer designation     Sequence                         Length 
and orientation 

PS34-YG (forward)      5'-TAYGGNCCNACNGA-3'             14-mer 
PS6-FF (reverse)       5'-TSNCCNCCNADNTCRAARAA-3'       20-mer 
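
The degenerate primers of Table III are written with IUPAC ambiguity codes, so each sequence stands for a pool of oligonucleotides. As an illustration, the sketch below counts the number of distinct sequences in each pool and forms the reverse complement of the reverse primer. The lookup tables are the standard IUPAC nucleotide codes; the function names are hypothetical and the code is not part of the original primer design procedure.

```python
# Standard IUPAC nucleotide ambiguity codes and their complements.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer:
        total *= len(IUPAC[base])
    return total


def reverse_complement(primer):
    """Reverse complement that keeps the ambiguity codes."""
    return "".join(COMPLEMENT[base] for base in reversed(primer))


PS34_YG = "TAYGGNCCNACNGA"
PS6_FF = "TSNCCNCCNADNTCRAARAA"

if __name__ == "__main__":
    print(len(PS34_YG), degeneracy(PS34_YG))
    print(len(PS6_FF), degeneracy(PS6_FF))
    print(reverse_complement(PS6_FF))
```

Read in codons, PS34-YG is TAY GGN CCN ACN GA, which matches Tyr-Gly-Pro-Thr followed by the first two bases of a Glu codon (the YGPTE motif). The reverse complement of PS6-FF begins with TTY TTY, two Phe codons, which is consistent with the primer name.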



PCR conditions for amplification of DNA from P. fluorescens A2-2 

A fragment internal to nonribosomal peptide synthetases (NRPS) was 
amplified using the PS34-YG and PS6-FF oligonucleotides and P. fluorescens 
A2-2 chromosomal DNA as template. Reaction buffer and Taq polymerase 
from Promega were used. The cycling profile, performed in a Personal 
thermocycler (Eppendorf), consisted of 30 cycles of 1 min at 95°C, 1 min at 
[...], 2 min at 72°C. PCR products were of the expected size (approx. 
750 bp) based on the location of the primers within the NRPS domains of 
other synthetase genes. 



DNA cloning 

PCR amplification fragments were cloned into the pGEM-Teasy vector 
according to the manufacturer (Qiagen, Inc., Valencia, CA). In this way, 
cloned fragments are flanked by two EcoRI restriction sites, in order to 
facilitate subsequent subcloning in other plasmids (see below). Since NRPS 
enzymes are modular, clones from the degenerate PCR primers 
represent a pool of fragments from different domains. 



DNA sequencing 

All sequencing was performed using primers directed against the 
cloning vector, with an ABI Automated sequencer (Perkin-Elmer). Cloned 
DNA sequences were identified using the BLAST server of the National 
Center for Biotechnology Information accessed over the Internet (Altschul 
et al., Nucleic Acids Res. 1997, 25, 3389-3521). All of the sequences have 
signature regions for NRPSs and show high similarity in BLAST searches 
to bacterial NRPSs, showing that they are in fact of peptide origin. 
Moreover, a probable domain similarity search was performed using the 
PROSITE (European Molecular Biology Laboratory, Heidelberg, Germany) web 
server. 

Gene disruption of Pseudomonas fluorescens A2-2 

In order to analyse the function of the genes cloned, these genes 
were disrupted through homologous recombination (Fig. 9). For this 
purpose, recombinant plasmids (pG-PS derivatives) harbouring the NRPS 
gene fragment were digested with the EcoRI restriction enzyme. The resulting 
fragments belonging to the gene to be mutated were cloned into the 
pK18mob mobilizable plasmid (Schafer et al. Gene 1994, 145, 69-73), a 
chromosomal integrative plasmid able to replicate in E. coli but not in 
Pseudomonas strains. Recombinant plasmids were introduced first into the 
E. coli S17-λPIR strain by transformation and then into P. fluorescens A2-2 
through biparental conjugation (Herrero et al. J. Bacteriol. 1990, 172, 
6557-6567). Different dilutions of the conjugation were plated onto LB solid 
medium containing ampicillin plus kanamycin and incubated overnight at 
27°C. Kanamycin-resistant transconjugants, containing plasmids 
integrated into the genome via homologous recombination, were selected. 

Biological assay (biotest) for safracin production 

P. fluorescens A2-2 and its derivatives were incubated in 50 
ml baffled Erlenmeyer flasks containing fermentation medium with the 
corresponding antibiotics. Initially, SA3 fermentation medium was used 
(Ikeda Y. J. Ferment. Technol. 1985, 63, 283-286). In order to increase the 
productivity of the fermentation process, statistical-mathematical methods 
such as Plackett-Burman designs were used to select nutrients, and response 
surface optimisation techniques were tested (Hendrix C. Chemtech 1980, 
10, 488-497) in order to determine the optimum level of each key 
independent variable. Experiments to improve culture conditions such as 
incubation temperature and agitation have also been done. Finally, a highly 
safracin B producing medium named 16B (152 g/l of mannitol, 35 g/l of 
G20-25 yeast, 26 g/l of CaCO3, 14 g/l of ammonium sulphate, 0.18 g/l of 
ferric chloride, pH 6.5) was selected. 

Safracin production was assayed by testing the capacity of 10 µl of the 
supernatant of a 3-day Pseudomonas sp. culture incubated at 27°C to 
inhibit a Bacillus subtilis solid culture (Alijah et al. Appl. Microbiol. 
Biotechnol. 1991, 34, 749-755). P. fluorescens A2-2 cultures produced 
inhibition zones of 10-14 mm diameter, while non-producing mutants did 
not inhibit B. subtilis growth. Three isolated clones had the safracin 
biosynthetic pathway affected. In order to confirm the results, HPLC 
analysis of safracin production was performed. 

HPLC analysis of safracin production 

The supernatant was analysed by HPLC using a Symmetry C-18, 
300 Å, 5 µm, 250 x 4.6 mm column (Waters) with a guard column 
(Symmetry C-18, 5 µm, 3.9 x 20 mm, Waters). An ammonium acetate buffer 
(10 mM, 1% diethanolamine, pH 4.0)-acetonitrile gradient was the mobile 
phase. Safracin was detected by absorption at 268 nm. Fig. 6 shows the HPLC 
profiles of safracin and safracin precursors produced by the P. fluorescens 
A2-2 strain and of the different safracin-like structures produced by the 
P. fluorescens mutants. 



Example 3. Cloning and sequence analysis of safracin cluster 



Inverse PCR and phage library hybridisation 

Southern hybridisation on mutant chromosomal DNAs verified the 
correct gene disruption and demonstrated that the peptide synthetase 
fragment cloned into the pK18mob plasmid was essential for the production 
of safracin. Analysis of the safracin non-producing mutants obtained 
demonstrated that all of them had a disruption in the same gene, 
sacA. 

Inverse PCR from genomic DNA and screening of a phage library of 
P. fluorescens A2-2 genomic DNA revealed the presence of additional genes 
flanking the sacA gene, probably involved in safracin biosynthesis. 

The GenBank accession number for the nucleotide sequence data of 
the P. fluorescens A2-2 safracin biosynthetic cluster is AY061859. 

Cosmid library construction and heterologous expression 

To determine whether the safracin cluster was able to confer safracin 
biosynthetic capability on a non-producer strain, it was cloned into a 
wide-range cosmid vector (pLAFR3, Staskawicz B. et al. J. Bacteriol. 1987, 
169, 5789-5794) and conjugated into different Pseudomonas sp. collection 
strains. 

To obtain a clone containing the whole cluster, a cosmid library was 
constructed and screened. For this purpose, chromosomal DNA was 
partially digested with the restriction enzyme PstI, and the fragments were 
dephosphorylated and ligated into the PstI site of cosmid vector pLAFR3. 
The cosmids were packaged with Gigapack III gold packaging extracts 
(Stratagene) following the manufacturer's recommendations. Infected cells 
of strain XL1-Blue were plated on LB-agar supplemented with 50 µg/ml of 
tetracycline. Positive clones were selected using colony hybridization with 
a DIG-labeled DNA fragment belonging to the 3'-end of the safracin cluster. 
In order to ensure the cloning of the whole cluster, a new colony 
hybridization with a 5'-end DNA fragment was done. Only cosmid pL30p 
showed multiple hybridizations with DNA probes. To confirm the accurate 
cloning, PCR amplification and DNA sequencing with DNA oligonucleotides 
belonging to the safracin sequence were carried out. The size of the insert 
of pL30P was 26,705 bp. The pL30p clone DNA was transformed into E. 
coli S17-λPIR and the resulting strain was conjugated with the 
heterologous Pseudomonas sp. strains. The pL30p cosmid was introduced 
into P. fluorescens CECT378 and P. aeruginosa CECT110 by biparental 
conjugation as described above. Once a clone encoding the whole cluster 
was identified, it was determined whether the candidate was capable of 
producing safracin. Safracin production in the conjugated strains was 
assessed by HPLC analysis and biological assay of broth culture 
supernatants as previously described. 

The strain P. fluorescens CECT378 expressing the pL30p cosmid 
(PM-19-002) was able to produce safracin in considerable amounts, 
whereas safracin production in the P. aeruginosa CECT110 strain expressing 
pL30P (PM-16-002) was 10 times lower than in CECT378. Safracin 
production in these strains was about 22% and 2%, respectively, of the 
total production of the natural producer strain. 

Genes involved in the formation of safracin. Sequence analysis of 
sacABCDEFGH and sacIJ operons 

Computer analyses of the DNA sequence of pL30P revealed 14 ORFs 
(Fig. 1). A potential ribosome binding site precedes each of the ATG start 
codons. 

In the sacABCDEFGH operon, three very large ORFs, sacA, sacB and 
sacC (positions 3052 to 6063, 6080 to 9268 and 9275 to 13570 of the P. 
fluorescens A2-2 safracin sequence SEQ ID NO:1, respectively) can be read 
in the same direction and encode the putative safracin NRPSs: SacA (1004 
amino acids, Mr 110452), SacB (1063 amino acids, Mr 117539) and SacC 
(1432 amino acids, Mr 157331). The three NRPS genes contain 
domains resembling amino acid activating domains of known peptide 
synthetases. Specifically, the amino acid activating domains from these 
NRPS genes are very similar to three of the four amino acid activating 
domains (Gly, Tyr and Tyr) found in the Myxococcus xanthus saframycin 
NRPSs (Pospiech et al. Microbiology 1995, 141, 1793-803; Pospiech et al. 
Microbiol. 1996, 142, 741-746). In particular, SacA (SEQ ID NO:2) shows 
33% identity with saframycin Mx1 synthetase B protein (SafB) from M. 
xanthus (NCBI accession number U24657), whereas SacB (SEQ ID NO:3) 
and SacC (SEQ ID NO:4) share, respectively, 39% and 41% identity with 
saframycin Mx1 synthetase A (SafA) from M. xanthus (NCBI accession 
number U24657). Fig. 2 shows a comparison among SacA, SacB and 
SacC and the different amino acid activating domains of saframycin NRPS. 

Downstream of sacC there are five small ORFs reading in the same direction 
as the NRPS genes (Fig. 1). The first one, sacD (position 13602 to 14651 
of the P. fluorescens A2-2 safracin sequence), encodes a putative protein, 
SacD (350 amino acids, Mr 39187; SEQ ID NO:5), with no similarities in the 
GenBank database. The next one, sacE (position 14719 to 14901 of the P. 
fluorescens A2-2 safracin sequence), encodes a small putative protein 
called SacE (61 amino acids, Mr 6729; SEQ ID NO:6), which shows some 
similarity with proteins of unknown function in the databases (ORF1 from 
Streptomyces viridochromogenes (NCBI accession number Y17268; 44% 
identity) and MbtH from Mycobacterium tuberculosis (NCBI accession 
number Z95208; 36% identity)). The third ORF, sacF (position 14962 to 
16026 of the P. fluorescens A2-2 safracin sequence), encodes a 355-residue 
protein with a calculated molecular weight of 39,834 (SEQ ID NO:7). This 
protein most closely resembles hydroxyneurosporene methyltransferase 
(CrtF) from Chloroflexus aurantiacus (NCBI accession number AF288602; 
25% identity). The nucleotide sequence of the fourth ORF, sacG (position 
16115 to 17155 of the P. fluorescens A2-2 safracin sequence), predicts a 
gene product of 347 amino acids having a molecular mass of 38.22 kDa (SEQ 
ID NO:8). The protein, called SacG, is similar to bacterial 
O-methyltransferases, including O-demethylpuromycin-O-methyltransferase 
(DmpM) from Streptomyces anulatus (NCBI accession number P42712; 
31% identity). A computer search also shows that this protein contains the 



WO 2004/056998 




PCT/GB2003/005563 



three sequence motifs found in diverse S-adenosylmethionine-dependent 
methyltransferases (Kagan and Clarke, Arch. Biochem. Biophys. 1994, 310, 
417-427). The fifth gene, sacH (position 17244 to 17783 of the P. 
fluorescens A2-2 safracin sequence), encodes a putative protein SacH (180 
amino acids, Mr 19632; SEQ ID NO:9). A computer search for similarities 
between the deduced amino acid sequence of SacH and other protein 
sequences revealed identity with some conserved hypothetical proteins of 
unknown function, which contain a well conserved transmembrane motif 
and a dihydrofolate reductase-like active site (conserved hypothetical 
protein from Pseudomonas aeruginosa PAO1, NCBI accession number 
P3469; 35% identity). 



Upstream of the sacABCDEFGH operon, reading in the opposite sense, a 
two-gene operon, sacIJ, is located. The sacI gene (position 2513 to 1854) 
encodes a 220-amino acid protein (Mr 24219; SEQ ID NO:10) that most 
closely resembles ubiquinone/menaquinone methyltransferase from 
Thermotoga maritima (NCBI accession number AE001745; 32% identity). 
The sacJ gene (position 1861 to 335) encodes a 509-amino acid protein 
(SEQ ID NO:11), with a molecular mass of 55341 Da, similar to bacterial 
monooxygenases/hydroxylases, including putative monooxygenases from 
Bacillus subtilis (NCBI accession number Y14081; 33% identity) and 
Streptomyces coelicolor (NCBI accession number AL109972; 29% identity). 

The sacABCDEFGH and sacIJ operons are transcribed divergently and 
are separated by approximately 450 bp. Both operons are flanked by 
residual transposase fragments. 



Related safracin cluster genes 

A putative ORF (orf1; position 18322 to 19365 of the P. fluorescens A2-2 
safracin sequence) located at the 3'-end of the safracin sequence has been 
found (Fig. 1). The ORF1 protein (SEQ ID NO:12) shows similarity with 
aminopeptidases from the GenBank database (peptidase M20/M25/M40 
family from Caulobacter crescentus CB15; NCBI accession number 
NP422131; 30% identity). Using the strategy described in Example 2, the 
gene disruption of orf1 did not affect safracin production in P. fluorescens 
A2-2. 

At the 3'-end of the safracin sequence cloned in the pL30p cosmid, three 
putative ORFs (orf2, orf3 and orf4) were found. Reading in the opposite 
direction to the sacABCDEFGH operon, the orf2 gene (position 22885 to 21169 
of SEQ ID NO:1) codes for a protein, ORF2 (SEQ ID NO:13), with 
similarities to Aquifex aeolicus HoxX sensor protein (NCBI accession 
number NC000918.1; 35% identity), whereas the orf3 gene (position 23730 to 
23041 of SEQ ID NO:1) codes for the ORF3 protein (SEQ ID NO:14), which 
shares 44% identity with a glycosyl transferase related protein from 
Xanthomonas axonopodis pv. citri str. 306 (NCBI accession number 
NP642442). 

The third gene is located at the 3'-end of SEQ ID NO:1 (position 
25037 to 26095). This gene, named orf4, encodes a 
protein, ORF4 (SEQ ID NO:15), that most closely resembles a 
hypothetical isochorismatase family protein YcdL from Escherichia coli 
(NCBI accession number P75897; 32% identity). 

Presumably, these three genes are not involved in the safracin 
biosynthetic pathway; however, future disruption of these genes will be 
needed to confirm this assumption. 

The different DNA sequences found are listed at the end of the 
description. 



Example 4. Functional analysis of the safracin loci and search for 

Since the pathway for synthesis of safracin in P. fluorescens A2-2 is 
at present unknown, the inactivation of each of the genes described in 
Example 3 would permit fundamental studies on the mechanism of 
safracin biosynthesis in this strain. 

In order to analyze the function of each particular protein in the 
safracin production pathway, each gene of the cluster except sacE was 
disrupted. All of the genetic mutants were obtained 
following the disruption strategy previously described. 

Figure 6 is a summary of the different mutants constructed in this 
invention as well as a summary of the compounds produced by the 
mutants as a consequence of the gene disruption. In the wild type strain 
both safracin A and B and other compounds, P2 and P14, were clearly 
detected by HPLC (see Fig. 6, WT). The disruption of the sacA (PM-S1-002), 
sacB (PM-S1-003), sacC (PM-S1-004), sacD (PM-S1-010), sacF (PM-S1-008) 
and sacG (PM-S1-009) genes generated mutants that were 
unable to produce either safracin A and safracin B or the precursor 
compounds with retention times below 15 min, P2 and P14. 
The structure elucidation of P14 and P2 revealed that P14 is a 3-methyl-O- 
methyl tyrosine, whereas P2 is a 3-hydroxy-5-methyl-O-methyl tyrosine. 
Because of the small size of the sacE gene, the sacE- mutant could not 
be obtained by gene disruption, but deletion of this gene is in 
progress. The overexpression of SacE protein, in trans, had no effect on 
safracin B/A production. The sacI- mutants (PM-S1-006) produced P2, P14 
and a significant amount of a compound called P19B (Fig. 6; Fig. 7b*). 
Structure elucidation of P19B revealed that this compound is a safracin- 
like molecule in which the N-Me and one of the OH groups from the quinone 
ring are missing. In the sacJ- mutants (PM-S1-005), P2, P14, P19B and two 
new compounds called P22A and P22B were obtained (Fig. 6; Fig. 7a*). 
Structure elucidation of P22A and P22B revealed that they are safracin A 
and safracin B like molecules, respectively, without one of the -OH groups 
from the quinone ring. The biological assay of the sacI- and sacJ- 
mutant extracts revealed very low activity against Bacillus subtilis. 

The disruption of the sacI gene with the reconstitution of sacJ gene 
expression resulted in a mutant producing new safracins, PM-S1-007. The 
two antibiotics produced, at levels of production as high as the levels of 
safracin A and safracin B in the wild type strain, have been named 
safracin D and safracin E (Fig. 7c*). Safracin D and safracin E are 
safracin B and safracin A like molecules, respectively, where the N- 
methylation is missing. 

These results strongly suggest that i) the sacA, sacB and sacC genes 
encode the safracin NRPSs; ii) the sacD, sacF and sacG genes are 
responsible for the transformation of L-Tyr into the L-Tyr derivative P2; 
and iii) sacI and sacJ are responsible for the tailoring modifications that 
convert P19 and P22 into safracin. 
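
The reasoning from mutant phenotypes to gene function can be laid out as a small lookup. The sketch below is only a bookkeeping aid: PRODUCTS lists, for each strain, the compounds that this Example explicitly reports as detected (nothing else is assumed to be absent or present), and the helper names are hypothetical.

```python
# Compounds reported in this Example for each strain (Fig. 6 summary).
PRODUCTS = {
    "wild type": {"safracin A", "safracin B", "P2", "P14"},
    "sacA-": set(),
    "sacB-": set(),
    "sacC-": set(),
    "sacD-": set(),
    "sacF-": set(),
    "sacG-": set(),
    "sacI-": {"P2", "P14", "P19B"},
    "sacJ-": {"P2", "P14", "P19B", "P22A", "P22B"},
    "sacI- + sacJ": {"safracin D", "safracin E"},
}


def genes_required_for(compound):
    """Single-gene disruptions that abolish a compound made by the wild type."""
    if compound not in PRODUCTS["wild type"]:
        raise ValueError(f"{compound} is not reported for the wild type")
    return sorted(
        strain.rstrip("-")
        for strain, found in PRODUCTS.items()
        if strain.endswith("-") and compound not in found
    )


def new_compounds(strain):
    """Compounds reported for a strain but not for the wild type."""
    return sorted(PRODUCTS[strain] - PRODUCTS["wild type"])


if __name__ == "__main__":
    print("needed for P2:", genes_required_for("P2"))
    print("needed for safracin B:", genes_required_for("safracin B"))
    print("new in sacJ-:", new_compounds("sacJ-"))
```

The lookup alone does not separate the NRPS genes from the P2-forming genes, since disruption of either group abolishes both P2 and safracin; that distinction rests on the sequence similarities described in Example 3.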



Characterization of Natural Precursors: 

P-14 

[Chemical structure of P-14] 

C11H15NO3 
Exact Mass: 209.11 
Mol. Wt.: 209.24 
C, 63.14; H, 7.23; N, 6.69; O, 22.94 

Strain: 

Pseudomonas fluorescens A2-2 (wild type) (PM-S1-001) 

Fermentation conditions: 

Seed medium YMP3 containing 1% glucose; 0.25% beef extract; 0.5% 
bacto-peptone; 0.25% NaCl; 0.8% CaCO3 was inoculated with 0.1% of a 
frozen vegetative stock of the microorganism, and incubated on a rotary 
shaker (250 rpm) at 27°C. After 30 h of incubation, the 2% (v/v) seed 
culture was transferred into 2000 ml Erlenmeyer flasks containing 250 ml 
of the M-16B production medium, composed of 15.2% mannitol; 3.5% 
dried brewer's yeast; 1.4% (NH4)2SO4; 0.001% FeCl3; 2.6% CaCO3. The 
temperature of the incubation was 27°C from the inoculation until 40 hours 
and then 24°C until the end of the process (71 hours). The pH was not 
controlled. The agitation of the rotary shaker was 220 rpm with 5 cm 
eccentricity. 
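
For bench work, the percentage recipe above translates into weighed amounts per flask. The sketch below assumes that the percentages of solid components are weight per volume (grams per 100 ml), which the text does not state explicitly, and that the 2% (v/v) seed transfer is calculated on the 250 ml of production medium. All names are hypothetical.

```python
# M-16B production medium as listed above, in percent.
# Assumption: percentages are weight/volume (g per 100 ml).
M16B_PERCENT = {
    "mannitol": 15.2,
    "dried brewer's yeast": 3.5,
    "(NH4)2SO4": 1.4,
    "FeCl3": 0.001,
    "CaCO3": 2.6,
}


def grams_for_volume(percent_wv, volume_ml):
    """Grams needed for volume_ml at the given % (w/v)."""
    return percent_wv * volume_ml / 100.0


def inoculum_ml(volume_ml, percent_vv=2.0):
    """Seed culture volume for a % (v/v) transfer."""
    return percent_vv * volume_ml / 100.0


def recipe(volume_ml):
    """Amount in grams of each component for one batch of medium."""
    return {name: grams_for_volume(pct, volume_ml)
            for name, pct in M16B_PERCENT.items()}


if __name__ == "__main__":
    for name, grams in recipe(250).items():
        print(f"{name}: {grams:.4g} g per 250 ml flask")
    print(f"seed culture: {inoculum_ml(250):.4g} ml")
```

For a 250 ml flask this gives 38 g of mannitol, 8.75 g of yeast, 3.5 g of ammonium sulphate, 2.5 mg of FeCl3 and 6.5 g of CaCO3, with 5 ml of seed culture.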

Isolation: 

After 71 hours of incubation, 2 Erlenmeyer flasks were pooled and 
the 500 ml of fermentation broth was clarified by centrifugation at 7,500 
rpm for 15 minutes. 50 grams of the resin XAD-16 (Amberlite) were added 
to the supernatant and mixed for 30 minutes at room temperature. 
Then, the resin was recovered from the clarified broth by filtration. The 
resin was washed twice with distilled water and extracted with 250 ml of 
isopropanol (2-PrOH). The alcohol extract was dried under high vacuum 
to give 500 mg of crude extract. This crude extract was dissolved in 
methanol and purified by column chromatography using Sephadex LH-20 
and methanol as mobile phase. The P-14 compound was eluted and dried 
as a 15 mg yellowish solid. The purity was tested by analytical HPLC and 
1H NMR. 

P-14 was also isolated in a similar way from cultures of the sacJ- mutant 
(PM-S1-005), using semipreparative HPLC as the last step in the 
purification process. 

Biological activities: 
Not active 

Spectroscopic data: 

ESMS m/z 254 (C11H14NO3Na2+), 232 (C11H15NO3Na+), 210 (M+H+). 1H NMR 
(300 MHz, CD3OD): 7.07 (d, J=8.1 Hz, H-9), 7.06 (s, H-5), 6.84 (d, J=8.1 
Hz, H-8), 3.79 (s, H-11), 3.72 (dd, J=8.7, 3.9 Hz, H-2), 3.20 (dd, J=14.4, 
3.9 Hz, H-3a), 2.91 (dd, J=14.4, 8.9 Hz, H-3b), 2.16 (s, H-10). 13C NMR (75 
MHz, CD3OD): 174.1 (C-1), 158.6 (C-7), 132.5 (C-5), 128.9 (C-9), 128.5 
(C-4), 128.0 (C-6), 111.4 (C-8), 57.6 (C-2), 55.8 (C-11), 37.4 (C-3), 16.3 
(C-10) 
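
The three ESMS peaks of P-14 correspond to the protonated molecule and two sodium adducts. The sketch below reproduces their expected m/z values from the ion compositions given above. The monoisotopic masses and the electron mass in the code are assumed reference values supplied for illustration, not data from this description, and the function names are hypothetical.

```python
# Monoisotopic atomic masses and electron mass (assumed reference values).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "Na": 22.9897693,
}
ELECTRON = 0.00054858


def neutral_mass(counts):
    """Monoisotopic mass of a neutral composition such as {'C': 11, ...}."""
    return sum(MONOISOTOPIC[element] * n for element, n in counts.items())


def cation_mz(counts):
    """m/z of a singly charged cation with the given atomic composition."""
    return neutral_mass(counts) - ELECTRON


P14 = {"C": 11, "H": 15, "N": 1, "O": 3}

# Ion compositions as assigned in the spectroscopic data of P-14.
IONS = {
    "[M+H]+": {"C": 11, "H": 16, "N": 1, "O": 3},
    "[M+Na]+": {"C": 11, "H": 15, "N": 1, "O": 3, "Na": 1},
    "[M-H+2Na]+": {"C": 11, "H": 14, "N": 1, "O": 3, "Na": 2},
}

if __name__ == "__main__":
    print("M", round(neutral_mass(P14), 2))
    for label, ion in IONS.items():
        print(label, round(cation_mz(ion), 4))
```

Rounded to nominal mass the three ions fall at m/z 210, 232 and 254, and the neutral mass rounds to the exact mass of 209.11 listed for C11H15NO3.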



P-2 

[Chemical structure of P-2] 

C11H15NO4 
Exact Mass: 225.10 
Mol. Wt.: 225.24 
C, 58.66; H, 6.71; N, 6.22; O, 28.41 

Strain: 

Pseudomonas fluorescens A2-2 (wild type) (PM-S1-001) 

Fermentation conditions: 

The same process as for P-14 

Isolation: 

Similar procedure to that for P-14, except in the Sephadex 
chromatography, where the fractions containing P-2 eluted later. A 
semi-preparative HPLC step (Symmetry Prep C-18 column, 7.8 x 150 mm, 
AcONH4 10 mM pH=3/CH3CN 95:5 held for 5 min and then gradient from 
5 to 6.8% of CH3CN in 3 min) was necessary to purify P-2. 
This compound has also been isolated from the fermentation broth of 
Pseudomonas putida ATCC12633 + pB5H83 (PM-17-004) as a result of 
heterologous expression. 

Biological activities: 
Not active 

Spectroscopic data: 

ESMS m/z 226 [M+H]+; 1H NMR (CD3OD, 300 MHz): 6.65 (d, J= 1.8 Hz, 
H-5), 6.59 (d, J= 1.8 Hz, H-9), 3.72 (s, H-11), 3.71 (dd, J= 9.0, 4.2 Hz, 
H-2), 3.16 (dd, J= 14.4, 4.2 Hz, H-3a), 2.83 (dd, J= 14.4, 9.0 Hz, H-3b), 
2.22 (s, H-10); 13C NMR (DMSO, 75 MHz): 170.88 (s, C-1), 150.025 (s, C-7), 
144.56 (s, C-8), 132.28 (s, C-4), 130.36 (s, C-6), 121.73 (d, C-5), 115.55 (d, 
C-9), 59.06 (q, 7-OMe), 55.40 (d, C-2), 36.21 (t, C-3), 15.86 (q, 6-Me). 

Characterization of safracin-like compounds obtained by knock out 



COMPOUND P-22B 

[Chemical structure of P-22B] 

C28H38N4O6 
Exact Mass: 526.28 
Mol. Wt.: 526.62 
C, 63.86; H, 7.27; N, 10.64; O, 18.23 

Strain: 

sacJ- mutant from P. fluorescens A2-2 (PM-S1-005) 

Fermentation conditions: 

50 liters of the SAM-7 medium, composed of dextrose (3.2%), 
mannitol (9.6%), dry brewer's yeast (2%), ammonium sulphate (1.4%), 
potassium secondary phosphate (0.03%), potassium chloride (0.8%), iron 
(III) chloride 6-hydrate (0.001%), L-tyrosine (0.1%), calcium carbonate 
(0.8%), poly-(propylene glycol) 2000 (0.05%) and antifoam ASSAF 1000 
(0.2%), was poured into a jar-fermentor (Bioengineering LP-351) with 75 l 
total capacity and, after sterilization, sterile antibiotics (ampicillin 0.05 
g/l and kanamycin 0.05 g/l) were added. Then, it was inoculated with seed 
culture (2%) of the mutant strain PM-S1-005. The fermentation was 
carried out for 71 h under aerated and agitated conditions (1.0 
l/l/min and 500 rpm). The temperature was controlled at 27°C (from 
the inoculation until 24 hours) and then at 25°C (from 24 h until the end 
of the process). The pH was controlled at pH 6.0 by automatic feeding of 
diluted sulphuric acid from 22 hours until the end of the process. 

Isolation 

The whole broth was clarified (Sharples centrifuge). The pH of the 
clarified broth was adjusted to pH 9.0 by addition of 10% NaOH and 
extracted with 25 litres of ethyl acetate. After 20 min of mixing, the two 
phases were separated. The organic phase was frozen overnight, then 
filtered to remove ice, and evaporated to a greasy dark green extract 
(65.8 g). This extract was mixed with 500 ml of hexane (250 ml two times) 
and filtered to remove hexane-soluble impurities. The remaining solid, 
after drying, gave 27.4 g of a dry green-beige extract. 

This new extract was dissolved in methanol and piirified by a Sephadex 
LH-20 chromatography (using methanol as mobile solvent) and the 
safracins-like compovmds were eluted in the central fractions {Ancdyzed on 
TLC conditions: Silica normal phase, mobUe phase: BtOAc:MeOH 5:3. Aprox. 
Rf valor: 0.3forP-22B, 0.25 P-22A and O.lforP-19). 

The pooled fractions, (7,6g) containing the three safi-acin-like compound 
were purified by a Silica coliimn using a mixture of EtOAcrMeOH from 
50:1 to 0:1. and other chromatographic system (isocratic 
CHCl3:MeOH:H20:AcOH 50:45:5:0.1). Compounds P22-A, P22-B and P19- 
B were purified by reversed-phase HPLC (SymmetiyPrep C-18 column 150 
X 7.8 mm, 4 mL/min, mobile phase: 5 min MeOH:H20 (0.02 % TFA) 5:95 
and gradient from MeOH:H20 (0.02 % TFA) 5:95 to MeOH 100 % in 30 
min). 



Biological activities of safracin P-22B 

Antimicrobial activity: On solid medium 

Bacillus subtilis, 10 µg/disk (6 mm diameter): 10 mm inhibition zone 

Spectroscopic data: 

HRFABMS m/z 509.275351 [M-H2O+H]+ (calcd for C28H37N4O5 509.276396, 
Δ 1.0 mmu); LRFABMS using m-NBA as matrix m/z (rel intensity) 509 
[M-H2O+H]+ (5), 460 (2.7), 391 (3). 
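The deviation quoted in mmu is the absolute difference between the observed and calculated m/z, expressed in thousandths of a mass unit. The sketch below reproduces that arithmetic for the three high-resolution measurements reported in this section. The monoisotopic mass table is an assumption supplied for illustration, and the calculated values in the text appear to be plain sums of atomic masses without an electron-mass correction, so the sketch follows the same convention.

```python
# Monoisotopic masses (u); standard tabulated values, supplied here as an
# assumption rather than taken from the text.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.0078250319,
    "N": 14.0030740052,
    "O": 15.9949146221,
}


def calculated_mz(counts):
    """Sum of monoisotopic masses for an ion formula given as element counts.

    No electron mass is subtracted, which matches the calculated values
    quoted in the text to about 1e-6 u.
    """
    return sum(MONOISOTOPIC[el] * n for el, n in counts.items())


def mmu_error(observed, calculated):
    """Absolute mass deviation in milli-mass units."""
    return abs(observed - calculated) * 1000.0


# Ion formula and observed m/z for each compound, as reported in the text.
MEASUREMENTS = {
    "P-22B [M-H2O+H]+": ({"C": 28, "H": 37, "N": 4, "O": 5}, 509.275351),
    "P-22A [M+H]+": ({"C": 28, "H": 39, "N": 4, "O": 5}, 511.290345),
    "P-19B [M-H2O+H]+": ({"C": 27, "H": 35, "N": 4, "O": 5}, 495.260410),
}

if __name__ == "__main__":
    for label, (counts, observed) in MEASUREMENTS.items():
        calc = calculated_mz(counts)
        print(f"{label}: calcd {calc:.6f}, error {mmu_error(observed, calc):.1f} mmu")
```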

1H NMR (CD3OD, 500 MHz): 6.70 (s, H-15), 6.52 (s, H-5), 4.72 (bs, H-11), 
4.66 (d, J= 2.0 Hz, H-21), 4.62 (dd, J= 8.4, 3.7 Hz, H-1), 3.98 (bd, J= 7.6 
Hz, H-13), 3.74 (s, 7-OMe), 3.71 (s, 17-OMe), 3.63 (m, overlapped signal, 
H-25), 3.62 (m, overlapped signal, H-3), 3.30 (m, H-22a), 3.29 (m, H-14a), 
3.18 (d, J= 18.6 Hz, H-14b), 2.90 (m, H-4a), 2.88 (m, H-22b), 2.76 (s, 
12-NMe), 2.30 (s, 16-Me), 2.22 (m, H-4b), 1.16 (d, J= 7.4 Hz, H-26); 

13C NMR (CD3OD, 125 MHz): 170.75 (s, C-24), 149.24 (s, C-18), 147.54 (s, 
C-8), 145.95 (s, C-7), 145.82 (s, C-17), 133.93 (s, C-16), 132.31 (s, C-9), 
131.30 (s, C-6), 128.95 (s, C-20), 121.93 (d, C-15), 121.76 (d, C-5), 121.44 
(s, C-10), 112.45 (s, C-19), 92.87 (d, C-21), 60.86 (q, 7-OMe), 60.76 (q, 
17-OMe), 59.39 (d, C-11), 57.96 (d, C-13), 55.51 (d, C-1), 54.29 (d, C-3), 
50.08 (d, C-25), 45.55 (t, C-22), 40.43 (q, 12-NMe), 32.56 (t, C-4), 25.84 
(t, C-14), 17.20 (q, C-26), 16.00 (q, 16-Me), 15.81 (q, 6-Me). 

COMPOUND P-22A 



[Structural formula of compound P-22A]

Strain: 

The same as for P-22B 

Fermentation conditions: 

The same as for P-22B 

Isolation: 

The same as for P-22B 

Biological activities of safracin P-22A 

Antitumor activities 

Antimicrobial activity: On solid medium 

Bacillus subtilis, 10 µg/disk (6 mm diameter): not active 

Spectroscopic data: 

HRFABMS m/z 511.290345 [M+H]+ (calcd for C28H39N4O5 511.292046, Δ 
1.7 mmu); LRFABMS using m-NBA as matrix m/z (rel intensity) 511 
[M+H]+ (61), 409 (25), 391 (4); 1H NMR (CD3OD, 500 MHz): 6.68 (s, H-15), 
6.44 (s, H-5), 3.71 (s, 7-OMe), 3.67 (s, 17-OMe), 2.72 (s, 12-NMe), 2.28 (s, 
16-Me), 2.20 (s, 6-Me), 0.87 (d, J= 7.1 Hz, H-26); 



COMPOUND P-19B 

[Structural formula of compound P-19B]

Strain: 

The same as for P-22B 

Fermentation conditions: 

The same as for P-22B 

Isolation: 

The same as for P-22B 

Biological activities of safracin P-19B 

Antitumor activities 

Antimicrobial activity: On solid medium 

Bacillus subtilis, 10 µg/disk (6 mm diameter): not active 

Spectroscopic data: 

HRFABMS m/z 495.260410 [M-H2O+H]+ (calcd for C27H35N4O5 495.260746, 
Δ 0.3 mmu); LRFABMS using m-NBA as matrix m/z (rel intensity) 495 
[M-H2O+H]+ (13), 460 (3), 391 (2); 1H NMR (CD3OD, 500 MHz): 6.67 (s, H-15), 
6.5 (s, H-5), 3.73 (s, 7-OMe), 3.71 (s, 17-OMe), 2.29 (s, 16-Me), 2.24 (s, 
6-Me), 1.13 (d, J= 7.1 Hz, H-26); 

New safracin compounds obtained by knock out 
SAFRACIN D 

[Structural formula of safracin D]

C27H34N4O7 
Exact Mass: 526.24 
Mol. Wt.: 526.58 
C, 61.58; H, 6.51; N, 10.64; O, 21.27 



Strain: 

sacI- with sacJ expression reconstitution from P. fluorescens A2-2 
(PM-S1-007) 

Fermentation conditions: 

50 litres of the SAM-7 medium (50 l) composed of dextrose (3.2%), 
mannitol (9.6%), dry brewer's yeast (2%), ammonium sulphate (1.4%), 
potassium secondary phosphate (0.03%), potassium chloride (0.8%), iron 
(III) chloride 6-hydrate (0.001%), L-tyrosine (0.1%), calcium carbonate 
(0.8%), poly-(propylene glycol) 2000 (0.05%) and antifoam ASSAF 1000 
(0.2%) was poured into a jar-fermentor (Bioengineering LP-351) with 75 l 
total capacity and, after sterilisation, sterile antibiotics (ampicillin 0.05 g/l 
and kanamycin 0.05 g/l) were added. Then, it was inoculated with seed 
culture (2%) of the mutant strain PM-S1-007. The fermentation was 
carried out for 89 h under aerated and agitated conditions (1.0 
l/l/min and 500 rpm). The temperature was held at 27°C from 
inoculation until 24 hours and then at 25°C from 24 h to the end of the 
process. The pH was controlled at pH 6.0 by automatic feeding of diluted 
sulphuric acid from 27 hours to the end of the process. 

WO 2004/056998    PCT/GB2003/005563 



Isolation: 

The cultured medium (45 l) thus obtained was, after removal of cells by 
centrifugation, adjusted to pH 9.5 with diluted sodium hydroxide and 
extracted twice with 25 litres of ethyl acetate. Mixing was carried out 
in an agitated vessel at room temperature for 20 minutes. The two 
phases were separated by a liquid-liquid centrifuge. The organic phases 
were frozen at -20°C, filtered to remove ice and evaporated to give 
35 g of a dark oily crude extract. After trituration with 5 l of hexane, 
the extract (12.6 g) was purified on a flash chromatography column (5.5 
cm diameter, 20 cm length) on silica normal phase, mobile phase ethyl 
acetate:MeOH, 1 l each of 1:0, 20:1, 10:1, 5:1 and 7:3. Fractions of 250 ml 
were eluted and pooled according to TLC (silica normal phase, EtOAc:MeOH 
5:2; safracin D Rf 0.2, safracin E Rf 0.05). The fraction containing impure 
safracin D and E was evaporated under high vacuum (2.2 g). An additional 
purification step under similar conditions (EtOAc:MeOH from 1:0 to 5:1) 
was necessary to separate D and E. The fractions containing safracin D 
and those containing safracin E were then separated, evaporated and 
further purified by Sephadex LH-20 column chromatography, eluting with 
methanol. 

The safracins D and E obtained were independently precipitated from 
CH2Cl2 (80 ml) and hexane (1500 ml) as a green/yellowish dried solid (800 
mg safracin D and 250 mg safracin E). 



Biological activities Safracin D 

Antitumor screening: 

Antimicrobial activity: On solid medium 

Bacillus subtilis, 10 µg/disk (6 mm diameter): Inhibition zone: 15 mm 
diameter 

Spectroscopic data 

ESMS: m/z 509 [M-H2O+H]+; 1H NMR (CDCl3, 300 MHz): 6.50 (s, C-15), 
4.02 (s, OMe), 3.73 (s, OMe), 2.22 (s, Me), 1.85 (s, Me), 0.80 (d, J= 7.2 Hz); 
13C NMR (CDCl3, 75 MHz): 186.51, 181.15, 175.83, 156.59, 145.09, 
142.59, 140.78, 137.84, 131.20, 129.01, 126.88, 121.57 (2 x C), 82.59, 
60.92, 60.69, 53.12, 21.40, 50.68, 50.22, 48.68, 40.57, 29.60, 25.01, 
21.46, 15.64, 8.44. 

SAFRACIN E 



[Structural formula of safracin E]

C27H34N4O6 
Exact Mass: 510.25 
Mol. Wt.: 510.58 
C, 63.51; H, 6.71; N, 10.97; O, 18.80 

Strain: 

The same as for safracin D 

Fermentation conditions: 
The same batch as safracin D 

Isolation: 

See safracin D conditions 



Biological activities Safracin E 

Antitumor screening: 

Antimicrobial activity: On solid medium 

Bacillus subtilis, 10 µg/disk (6 mm diameter): 9.5 mm inhibition zone 

Spectroscopic data 

ESMS: m/z 511 [M+H]+; 1H NMR (CDCl3, 300 MHz): 6.51 (s, C-15), 4.04 (s, 
OMe), 3.75 (s, OMe), 2.23 (s, Me), 1.89 (s, Me), 0.84 (d, J= 6.6 Hz); 13C 
NMR (CDCl3, 75 MHz): 186.32, 181.28, 175.83, 156.43, 145.27, 142.75, 
141.05, 137.00, 132.63, 128.67, 126.64, 122.00, 120.69, 60.69, 60.21, 
59.12, 58.04, 57.89, 50.12, 49.20, 46.72, 39.88, 32.22, 25.33, 21.29, 
15.44, 8.23. 

Example 5. Cross-feeding experiments 

Heterologous expression of safracin biosynthetic precursor genes for P2 
and P14 production 

In an attempt to shed light on the mechanism of P2 and P14 
biosynthesis, we cloned and expressed the downstream NRPS genes to 
determine their biochemical activity. 

To overproduce P14, the sacEFGH genes were cloned (pB7983) (Fig. 4). 
To overproduce P2 in a heterologous system, the sacD to sacH genes were 
cloned (pB5H83) (Fig. 4). For this purpose we PCR-amplified fragments 
harboring the genes of interest using oligonucleotides that contain an XbaI 
restriction site at the 5' end. Oligonucleotides PFSC79 
(5'-CGTCTAGACACCGGCTTCATGG-3') and PFSC83 
(5'-GGTCTAGATAACAGCCAACAAACATA-3') were used to amplify the sacE to 
sacH genes, and oligonucleotides 5HPT1-XB 
(5'-CATCTAGACCGGACTGATATTCG-3') and PFSC83 
(5'-GGTCTAGATAACAGCCAACAAACATA-3') were used to amplify the sacD to 
sacH genes. The PCR fragments digested with XbaI were cloned into the 
XbaI restriction site of the pBBR1-MCS2 plasmid (Kovach et al., Gene 
1994, 166, 175-176). The two plasmids, pB7983 and pB5H83, were 
introduced separately into three heterologous bacteria, P. fluorescens (CECT 
378), P. putida (ATCC 12633) and P. stutzeri (ATCC 17588), by conjugation 
(see Table II). When the culture broth from fermentation of the 
transconjugant strains was checked by HPLC analysis, large amounts of the 
P14 compound were observed in the three strains containing the pB7983 
plasmid, whereas large amounts of P2 and some P14 were observed when 
the pB5H83 plasmid was expressed in the heterologous bacteria. 
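A quick way to check that each oligonucleotide carries the XbaI recognition sequence (TCTAGA) near its 5' end is sketched below. The primer sequences are those given in the text; the function and variable names are illustrative assumptions and do not correspond to any tool used by the authors.

```python
# XbaI recognition sequence (cuts T^CTAGA).
XBAI_SITE = "TCTAGA"

# Primer sequences as given in the text, written 5' to 3'.
PRIMERS = {
    "PFSC79": "CGTCTAGACACCGGCTTCATGG",
    "PFSC83": "GGTCTAGATAACAGCCAACAAACATA",
    "5HPT1-XB": "CATCTAGACCGGACTGATATTCG",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def site_offset(primer, site=XBAI_SITE):
    """0-based position of the restriction site in the primer, or -1 if absent."""
    return primer.upper().find(site)


if __name__ == "__main__":
    # The site is palindromic, so it reads the same on both strands.
    print("palindromic:", reverse_complement(XBAI_SITE) == XBAI_SITE)
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, XbaI site at offset {site_offset(seq)}")
```

In all three primers the site follows a two-base 5' extension, which is consistent with the description of an XbaI site at the 5' end.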
Cross-feeding 

As shown in Example 4, the sacF- (PM-S1-008) and sacG- 
(PM-S1-009) mutants were unable to produce either safracins or the P2 
and P14 compounds. The addition of chemically synthesized P2 to these 
mutants during their fermentation yields safracin production. 

Moreover, the co-cultivation of a heterologous strain of P. stutzeri 
(ATCC 17588) harboring plasmid pB5H83 (PM-18-004), whose expression 
produces P2 and P14, with either one of the two mutants sacF- and sacG- 
resulted in safracin production. The co-cultivation of a heterologous 
strain of P. stutzeri (ATCC 17588) harboring plasmid pB7983 (PM-18-005), 
whose expression produces only P14, with either one of the two P. 
fluorescens A2-2 mutants mentioned above resulted in no safracin 
production at all. These results suggest that P14 is transformed into P2, a 
molecule that can easily be transported in and out through the 
Pseudomonas sp. cell wall and whose presence is absolutely necessary 
for the biosynthesis of safracin. 

Example 6. Biological production of new "unnatural" molecules 

The addition of 2 g/L of a specific modified P2 derivative precursor, 
P3, a 3-hydroxy-5-methyl-O-methyltyrosine, to the sacF- mutant 
(PM-S1-008) fermentation yielded two "unnatural" safracins that incorporated 
the modified precursor P3 in their structure, Safracin A(OEt) and Safracin 
B(OEt). 



SAFRACIN B-Ethoxy (Safracin B (OEt)) 

[Structural formula of safracin B (OEt)]

C30H40N4O7 
Exact Mass: 568.29 
Mol. Wt.: 568.66 
C, 63.36; H, 7.09; N, 9.85; O, 19.69 



Strain 

sacF- mutant from P. fluorescens A2-2 (PM-S1-008) 

Fermentation conditions: 

Seed medium containing 1% glucose; 0.25% beef extract; 0.5% bacto- 
peptone; 0.25% NaCl; 0.8% CaCO3 was inoculated with 0.1% of a frozen 
vegetative stock of the microorganism, and incubated on a rotary shaker 
(250 rpm) at 27°C. After 30 h of incubation, the 2% (v/v) seed culture of the 
mutant PM-S1-008 was transferred into 2000 ml Erlenmeyer flasks 
containing 250 ml of the M-16 B production medium, composed of 15.2 % 
mannitol; 3.5 % dried brewer's yeast; 1.4 % (NH4)2SO4; 0.001% FeCl3; 2.6 % 
CaCO3 and 0.2% P3 (3-hydroxy-5-methyl-O-methyltyrosine). The 
temperature of the incubation was 27°C from inoculation until 40 hours 
and then 24°C to the end of the process (71 hours). The pH was not 
controlled. The agitation of the rotary shaker was 220 rpm with 5 cm 
eccentricity. 

Isolation 

4 x 2000/250 ml Erlenmeyer flasks were pooled (970 ml) and 
centrifuged (12,000 rpm, 4°C, 10 min, Beckman J2-21 centrifuge) to remove 
cells. The clarified broth (765 ml) was adjusted to pH 9.0 with NaOH 10%. 
Then, the alkaline clarified broth was extracted with 1:1 (v/v) EtOAc (x2). The 
organic phase was evaporated under high vacuum and a greasy dark 
extract was obtained (302 mg). 

This extract was washed by hexane trituration to remove impurities 
and the solids were purified on a chromatography column using silica 
normal phase and a mixture of ethyl acetate:methanol (from 12:1 to 1:1). 
The fractions were analyzed under UV on TLC (silica 60, mobile phase 
EtOAc:MeOH 5:4; Rf 0.3 for safracin B-OEt and 0.15 for safracin A-OEt). From 
this, safracin B OEt (25 mg) and safracin A OEt (20 mg) were obtained. 



Biological activities of safracin B (OEt) 

Antitumor activities 

Antimicrobial activity: On solid medium 

Bacillus subtilis, 10 µg/disk (6 mm diameter): 17.5 mm inhibition zone 

Spectroscopic data: 

ESMS: m/z 551 [M-H2O+H]+; 1H NMR (CDCl3, 300 MHz): 6.48 (s, H-15), 
2.31 (s, 16-Me), 2.22 (s, 12-NMe), 1.88 (s, 6-Me), 1.43 (t, J= 6.9 Hz, 
Me-ethoxy), 1.35 (t, J= 6.9 Hz, Me-ethoxy), 0.81 (d, J= 7.2 Hz, H-26) 

SAFRACIN A-Ethoxy (Safracin A (OEt)) 



[Structural formula of safracin A (OEt)]

C30H40N4O6 
Exact Mass: 552.29 
Mol. Wt.: 552.66 
C, 65.20; H, 7.30; N, 10.14; O, 17.37 



Strain: 

The same as for Safracin B (OEt) 

Fermentation conditions: 

The same as for Safracin B (OEt) 



Isolation: 

4 x 2000/250 ml Erlenmeyer flasks were pooled (970 ml) and 
centrifuged (12,000 rpm, 4°C, 10 min, Beckman J2-21 centrifuge) to remove 
cells. The clarified broth (765 ml) was adjusted to pH 9.0 with NaOH 10%. 
Then, the alkaline clarified broth was extracted with 1:1 (v/v) EtOAc (x2). The 
organic phase was evaporated under high vacuum and a greasy dark 
extract was obtained (302 mg). 

This extract was washed by hexane trituration to remove impurities 
and the solids were purified on a chromatography column using silica 
normal phase and a mixture of ethyl acetate:methanol (from 12:1 to 1:1). 
The fractions were analysed under UV on TLC (silica 60, mobile phase 
EtOAc:MeOH 5:4; Rf 0.3 for safracin B-OEt and 0.15 for safracin A-OEt). From 
this, safracin B OEt (25 mg) and safracin A OEt (20 mg) were obtained. 




Antimicrobial activity: On solid medium 

Bacillus subtilis, 10 µg/disk (6 mm diameter): 10 mm inhibition zone 

Spectroscopic data: 

ESMS: m/z 553 [M+H]+; 1H NMR (CDCl3, 300 MHz): 6.48 (s, H-15), 2.33 (s, 
16-Me), 2.21 (s, 12-NMe), 1.88 (s, 6-Me), 1.42 (t, J= 6.9 Hz, Me-ethoxy), 
1.34 (t, J= 6.9 Hz, Me-ethoxy), 0.8 (d, J= 6.9 Hz, H-26) 

Example 7. Enzymatic transformation of Safracin B into Safracin A 

In order to assay the enzymatic conversion of safracin B 
into safracin A, 120-hour fermentation cultures (see conditions in 
Example 2, Biological assay (biotest) for safracin production) of different 
strains were collected and centrifuged (9,000 rpm x 20 min). The strains 
assayed were P. fluorescens A2-2, as wild type strain, and P. fluorescens 
CECT 378 + pBHPT3 (PM-19-006), as heterologous expression host. 
Supernatants were discarded and cells were washed twice (NaCl 0.9 %) and 
resuspended in 60 ml phosphate buffer 100 mM pH 7.2. The cell 
suspension was distributed in 20 ml portions into three Erlenmeyer flasks: 

A. Cell suspension + Safracin B (400 mg/L) 

B. Cell suspension heated at 100 °C for 10 min + Safracin B 
(400 mg/L) (negative control) 

C. Cell suspension without Safracin B (negative control) 

The biochemical reaction was incubated at 27 °C at 220 rpm and 
samples were taken every 10 min. Transformation of safracin B into 
safracin A was followed by HPLC. The results clearly demonstrated that 
the gene cloned in pBHPT3, sacH, codes for a protein responsible for the 
transformation of safracin B into safracin A. 

Based on these results we carried out an assay to find out whether this same 
enzyme was able to recognize a different substrate such as ecteinascidin 
743 (ET-743) and transform this compound into ET-745 (with the C-21 
hydroxy missing). The experiment above was repeated to obtain 
Erlenmeyer flasks containing: 

A. Cell suspension + ET-743 (567 mg/L approx.) 

B. Cell suspension heated at 100 °C for 10 min + ET-743 (567 
mg/L) (negative control) 

C. Cell suspension without ET-743 (negative control) 

The biochemical reaction was incubated at 27 °C at 220 rpm and 
samples were taken at 0, 10 min, 1 h, 2 h, 3 h, 4 h, 20 h, 40 h, 44 h and 48 h. 
Transformation of ET-743 into ET-745 was followed by HPLC. The results 
clearly demonstrated that the gene cloned in pBHPT3, sacH, codes for a 
protein responsible for the transformation of ET-743 into ET-745. This 
demonstrates that this enzyme recognizes ecteinascidin as a substrate and 
that it can be used in the biotransformation of a broad range of structures. 



SEQUENCES 

<110> PharmaMar 

<120> The gene cluster involved in safracin biosynthesis and its uses for 
genetic engineering 
<130> aaa 
<160> 15 

<170> PatentIn version 3.1 

<210> SEQ ID 1 
<211> Length: 26705 
<212> Type: DNA 
<213> Organism: Pseudomonas fluorescens A2-2 

<220> Feature: 
<221> NAME/KEY: sacB 
<222> LOCATION: (6080)..(9268) 
<223> OTHER INFORMATION: non ribosomal peptide synthetase gene 

<220> 
<221> sacC 
<222> (9275)..(13570) 
<223> non ribosomal peptide synthetase gene 

<220> 
<221> sacD 
<222> (13602)..(14651) 
<223> Hypothetical protein gene 

<220> 
<221> sacE 
<222> (14719)..(14901) 
<223> Hypothetical protein gene 

<220> 
<221> sacF 
<222> (14962)..(16026) 
<223> methyl-transferase protein gene 

<220> 
<221> sacG 
<222> (16115)..(17155) 
<223> methyl-transferase protein gene 

<220> 
<221> sacH 
<222> (17244)..(17783) 
<223> hypothetical protein gene 

<220> 
<221> sacI 
<222> (1854)..(2513) 
<223> Complementary, methyl-transferase protein gene 

<220> 
<221> sacJ 
<222> (335)..(1861) 
<223> Complementary, mono-oxygenase protein gene 

<220> 
<221> orf1 
<222> (18322)..(19365) 
<223> aminoacid peptidase-like protein 

<220> 
<221> orf2 
<222> (21169)..(22885) 
<223> complementary; hox-like regulator protein 

<220> 
<221> orf3 
<222> (23041)..(23730) 
<223> complementary; glycosyl transferase-like protein 

<220> 
<221> orf4 
<222> (25037)..(26095) 
<223> isochorismatase-like protein 
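The feature coordinates above are 1-based and inclusive, so the length of each annotated region is end - start + 1. The short sketch below tabulates those lengths and reports whether each is a whole number of codons. It is an illustrative aid only: the tuple layout and function names are assumptions, and the entry printed as "sad" in the scanned listing is read here as sacI.

```python
# (name, start, end, on_complementary_strand), transcribed from the listing.
# The feature printed as "sad" in the scanned listing is read as sacI,
# which is an assumption.
FEATURES = [
    ("sacB", 6080, 9268, False),
    ("sacC", 9275, 13570, False),
    ("sacD", 13602, 14651, False),
    ("sacE", 14719, 14901, False),
    ("sacF", 14962, 16026, False),
    ("sacG", 16115, 17155, False),
    ("sacH", 17244, 17783, False),
    ("sacI", 1854, 2513, True),
    ("sacJ", 335, 1861, True),
    ("orf1", 18322, 19365, False),
    ("orf2", 21169, 22885, True),
    ("orf3", 23041, 23730, True),
    ("orf4", 25037, 26095, False),
]


def feature_length(start, end):
    """Length in nucleotides of a 1-based, inclusive coordinate range."""
    return end - start + 1


def summarize(features):
    """Return (name, length, whole_codons, strand) for each feature."""
    rows = []
    for name, start, end, complementary in features:
        length = feature_length(start, end)
        strand = "-" if complementary else "+"
        rows.append((name, length, length % 3 == 0, strand))
    return rows


if __name__ == "__main__":
    for name, length, whole_codons, strand in summarize(FEATURES):
        print(f"{name}\t{strand}\t{length} nt\tmultiple of 3: {whole_codons}")
```

A length that is not a multiple of three may simply reflect a scanning error in the printed coordinates, so any such result should be checked against the original listing.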



<400> SEQUENCE 1 
ctgcaggtgg tttgcgcgcg gaagacccgc cactgccggt gcgctcgttt gaattgcaca 60 

tggcgtggcg tgggtcgcag gauaauganc cgggggagcg gtggttgcgg tcgcggattc 120 

agatgttttt tggcgacccc ganagccuCu aattaaactc cactaaaatc ggcgattgca 180 

gagcctgagt acaacacggc tiactggactg aagtgggcgc atcgtgccgc atagccatag 240 

tgatctcggt gtgtctcgcc atgtcccggc ccaggtcgta ggtcatgctc ttgcgcattg 300 

ccagcatctt cgtgctccct tgccagctgt ttcaggtcag gctctgacgc gcggatttag 360 

aatcgtccag cagccactca cccaagcgct ccttggccaa ggtcgatttt ttaccgaccc 420 

agatcaccac gccatccggc ctgacgatga ctccctcgcc agcggacaag ctgttattgt 480 

gcgaatcctc gcagatagat gccgtttgca accccggaaa atcacgtcga agatcggcgg 540 

ccagtgcttt ggcacgatga tgtaccagga caaaccgccc tgctcgcagc aactgggtca 600 

agctctgtcg tggtaatcgc tcaccctcgg gaagcaggct caacaggggt aaacgtcggc 660 

ccaccaagcg atgatcgcct cggcgccgca cactgtcata gcgcacgccc tctcccgcca 720 

gtgcactgac caccttctgc gccacatacg gggcccgtgt cgcctgcaag ccgatccagt 780 

gaatcagacg gcctataggg ccggaagccg tattgaatcg gaaaagcaga tccgtgttgc 840 

gcaaggctgc cgccgcaata ggcctacgct cggcctcgta actctccaga agatccatcg 900 

gcaatgtggc ctgtatcaca cccgccagct tccaggcgag gttcgccgcg tcaccgatcc 960 

ccatctgcaa accttgcccc ccggcgggga cgtgggtgtg agcagcatct cccagcagaa 1020 

ataccctccc ctggcgataa tgagtcgcca ggcgctgctg gctgcggtaa cgagcgctcc 1080 

acagcacctg cgccaatccg aaatcggttc ccagaatatc tttcatccct ccggcaattt 1140 

cctcgtgggt gaccggctgt ttgaccggag tatccatacg ttcgttgtct tcgatactga 1200 

cgcgataact gccatcgggt aatggaaaca gggcaaccag acccctggag accgaccttg 1260 

catggactgc aggtgaaggc ggatttctca agacaacgtc cgccaccacc aacgaatgct 1320 

tgtagtcctg cccgacaaac gaaatattga ggagttggcg gacagaactg ttgaccccat 1380 

cagcccccag cacccagtcg tagcggcttt gctgcacgct gccggtctcg ctgtgttcca 1440 

gggttacctc aacgtgtaaa tcgccggcat ccagagcctt tagcgcatac ccacgcttca 1500 

gattcacccc cttgcgattg acccaatcag tcaacaccga ctcggtctga gactgtggga 1560 

tgatcaccat gtggggatac tcacaaggga gtttggaaaa tgagagcgtt cggcccgcct 1620 

tgtcacccaa cggcgcgctt gcccagacga tcccccgacg tatcatctca tctgccacgc 1680 

cccaggcatt gagcaactcc agggtcacgg gctccaagcc aaaggcccgg gaatggggag 1740 

aggcagccgg tcttttatcg atgagatcaa ccgctatacc caactcggcc agagctgcag 1800 

ccacagccaa cccgacgggc cccgccccca cgaccaggac ctgtttattt ttaacgacca 1860 

tgatcaccca cctctccaca gcaggggcgg aacgtggcga caatgacgtg ctggtcggtg 1920 

acgctcgttt ccatgctctg cagcccggcc gtttgcgcga gctggattac ctcatgctct 1980 

gatcgataat ctgccagtga gcccagcagc cagttgatca atgtggtcaa tccccagccc 2040 

gcaggcaggt tctccaccag gcagaaaagg ccctgaggct tgagcacgcg acggacctca 2100 

ttcaggctga cagacttgtc agcccaatgg ccgaacgaca tcgagcacac caccagatcc 2160 

atgctttgcg aggggaatgg caaggcttcg gcaactcctt tgacgaacga ggcgaaggga 2220 

cgtcgtttgg cggcctcgtc gaccatgccc tgagccgggt cgacgccttc gaagcgcgcc 2280 

tcaggccaca gagcgaacat gcgttcgatc aatgcgccgg taccacaacc gatgtccaga 2340 

acacgctccg gtctcgaggt gcacatccat cggctcaaca ttcgcagaca gtcgtcatgg 2400 

gcctggctca gtttggtacc gtacttctct tcatacgtcg gcgcgatacg gctgaacgtc 2460 

cgcacaaaac ctggatttcc attgcctttc ccgccagaaa atacgttaga cattattgaa 2520 

catccatata tcaacagtta tccgccaagg accatagtag agaaaatcca tcccatccaa 2580 

ataaaaatta aataagtggg gctaaccgca atccagggaa actctgaaaa ggcccgctac 2640 

ttgtcgacgc ggctgtctgg aggccgcata gttactgaac ttactattaa aagactgggc 2700 

tttttcagag ccccaccgga tgttggctcc ttgtccatca tttcgggggc actgtaacat 2760 

tctgttacct ggctatcgct tgacttttaa tctgaacggg caattatagg tctaaccgca 2820 
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acagccccac ggcgcttaag ttcgaaaaaa gtagctgcac cttgctcaac tgcatcttgt 2880 

aataaggggc actttacaag ccgcataaga cataaatttt atgctgactc cccaaacaag 2940 

cacgacaagt aaaaaacact tgtccaatac caaggagggt atacggtgca agcttcttta 3000 

cgtcaaaaag ttctctgctt acagcagtct atcgatccca gccagccagg catgttactt 3060 

gaagtcgctt ttcatgtgat cacccatctt tctacttcgc agttggtatc gcgtatcgag 3120 

agagtggtcg agcgacatgc gtctttacgg cagcgctttg tcatgcgcaa tggcacttac 3180 

tggattgaac aagccccacc gcaacaacga cgctactgcg tggtacgcac ctatgatgaa 3240 

gcatcgaccg atgcactgct ggcgccgagc cgcgagcaca tcggggttga gtctgagcgt 3300 

ttgttccgcg ccgaagtcgt tgagcgcagc gacggacaac gctacttggt cttccgaatt 3360 

catcacatca tcgccgacct gtggtctgtc ggcctcctga ttcgagactt tgccgaagac 3420 

tgtatggacc gctccagcat caccctggcg tcaagaccga ttgccccgtt gatcgaccct 3480 

gagttctggc ggcaccaaat gtcacaggac actccgtttt ccttgcccat ggcctccctg 3540 

gaacagcaca cggaccgccg catggtgctg tcttcgttcg ttattgatca ggagagcagc 3600 

gctgacctgg cccgcctggc cacagcctgc gcggtaaccc cgtacaccgt aatgctcgcc 3660 

gcacaagtat tggcgctgtc cagaatcggc cagagtggcc gtctgtcact tgcggtgacg 3720 

ttccatggcc gcaacagggg caacaaggat gcggtaggtt acttcgccaa tacgcttgcc 3780 

gtgcctttcg atgtcagcga atgcagcgtg ggcgagtttg tcaaacgcac cgccaagcgc 3840 

ctggatgagg cctcaaaagc cagcgtcggt gccggttatc ccgaattggc agagttcatg 3900 

acgccgctgg gatgggctgc gaccgccccg accaatgcgg tgatttacca gcaggatatg 3960 

ccaggcatgc caagaggatt ggcggcggct ctgctgggat tgggcacggt gcagttgggc 4020 

gagatggcgc tgaccgcgga acaggcaccg cccagcatcg gcccgtttgc cactgcgctg 4080 

ctgctgacgc gccacgacgg caagctgcat ggccgggtcg aggtcgatcc tgcgcagcat 4140 

cccggttggc tggcagaggc gttagccaga cagttcgctg tgatcctgcg ggaaatggtg 4200 

cgtgatccac aggccagact gtcagccttg ccagcgtgcc tgttacacca accaaaatac 4260 

ccgagccaag cgcggccggc gcctgcgtca gaaacattgg tagccacctt tctccggcaa 4320 

gtcgccatca cgccggacaa gcccgcgctg cgtacgccgc aggccagcat cagctatagc 4380 

gaattggcca gtcgagtcgc caggctctcg gcagccttgc gcgtacgcgg cttcaaacct 4440 

gaacagaccc tggcaatact cctgcctcgc gatatcaatc tggtacccgc tctgctggcg 4500 
atcatggcct 


gcggtggcag 


ttatgtgcca 


cgttcgattc 


tgaccagggc 


ccgttgccgc 


cgtttcgctc 


acttggcgcc 


ctgctggtcc 


ccgctgcagg 


accagtccaa 


gcttcaagcc 


accggtgaac 


caaaaggcgt 


ggcgatcacc 


gcggctctcg 


attgtggccc 


cgagtacctg 


ttcgatcttt 


cgattttcga 


gatgtttgct 


gtttcctcgg 


tcatggcgct 


gatcgacaat 


aatacggtgc 


cgtcggtggc 


cgacgctttg 


cgcatgctca 


acctcgcggg 


agaacccctg 


aaactgaccg 


ccacacgcat 


cgtcaacctc 


accgccctgg 


tgatcgagcc 


cgcacaacaa 


acctgggtgg 


atgtcgttga 


tcaaaacatg 


ttgatcattc 


atggacacgg 


cgtggcgcaa 


gcttctttcc 


tgccggcatc 


cgatggcttg 


tggttgcccg 


atggccgcct 


ggactttatc 


ggtttccggg 


tcgagttggg 


gcctgttcag 


gaatccgcag 


tagtcgttgt 


gccgaaaggg 


ctcaaagcgc 


cgagcgaaga 


tgaagcggtg 


ggcgtactcc 


cctattacgc 


actaccggac 


aacacacatg 


gaaaaatcga 


cagaacgctg 


gaaagcgcca 


tgcgagatgc 


gaccgacgtc 


atcatcggac 


accccgtcca 


actccacgaa 


tcgcttacgc 


atttaacggg 


cctactgaga 


gacctctgga 


tcaggccaac 


catagaacaa 


tcggtattga 


caaaacctgc 


cgccgcgcca 


cattaatcag 


gagtaccgca 


tgagcgtcga 


atacggccag 


gaacagatct 


ggtttctgaa 


taccctggcg 


atgaaagtat 


ctatcgccgg 
ctcagtgacg 


cgaaccccgc 


cgaactcaac 


4560 


gcgattctca 


cggatcagga 


gggtttgacc 


4620 


ttgagcgacc 


tgctgtcgat 


gcccgacgcc 


4680 


aaggcctata 


tcctatttac 


cagcggctcc 


4740 


catgctaatg 


ccgccaacct 


gctgcgttgg 


4800 


gcgcaaacac 


tggcggcaac 


ccccactacg 


4860 


ccccttatgg 


tcggtggctg 


cgtacagccc 


4920 


ccggccctgc 


taaagggcac 


aacactgatc 


4960 


ttgcagcatg 


atgtactggt 


gccttccttg 


5040 


aaccgggatc 


tttacctgcg 


gcttcaggca 


5100 


tacggcccga 


cggaaacaac 


aacctattcc 


5160 


gagatcacca 


tcggttttcc 


actgtatggc 


5220 


caaagcgtcg 


gtatcggtgt 


acctggcgag 


5280 


ggctatgtca 


gcgaccccgt 


gcgtagcgcc 


5340 


cgttgctacc 


gcacgggaga 


ccgtgtccgc 


5400 


ggtcgagagg 
ggtcaaccaa gtggtggcct cccaggaaat tttgagaaca tcattcgcct ataaaaacca 6300 
gaagttgagc caggtcattt caccctccgc gacactgccc attcgcagcg cgcactgcat 6360 

tgacgatgta cctgggctgc aacgcctgat caacatggaa gcccagcgtg gctggtcgct 6420 

gagcagcgcg ccactgtacc gcttgctgct gataaaaacc ggcgaccagc aacatgagct 6480 

ggtcatctgc acccaccata tcgtctgcga tggcatctcg ctgcaactgc tgctgcaaaa 6540 

aatagtcagc gcctatcaag gccaaagcga tgggcgggtg ctcacaagtc cggatgaaga 6600 

gaccctgcaa ttcgtcgatt atgcggcctg gtcaaggcag cacgaatatg ccggtctcga 6660 

gtactggcgc cagcaactgg ccgacgcccc gacaatcctg gatatttcga caaaaaccgg 6720 

ccgaagtgag caacagacat ttctcggcgc gcgaattccc gtcgagttca gccaccacca 6780 

atggcaagca ttgcgccaga tattcagacc ccagggtatc tcctgcgcgg cggtgttcct 6840 

ggcggcctac tgcgtcgtcc tgcaccgcct ggccgagcag gacgacattc tgatcgggct 6900 

gccaacttca aatcgcctgc gtccggagtt ggcacaggtg atcggctacc tgtccaatct 6960 

gtgcgtgttt cgcagccagt atgctcacga ccagagcgtc acagactttc ttcaacaggt 7020 

tcaattgacc ttacccaact tgatcgagca oggggagacg cctttccagc aagtactgga 7080 

aagtgttgag cataoccggc aagccggtgt gacgccgttg tgccaagtac tgtttggtta 7140 

tgagcaggac gttcgacgca cgctggatat cggcgacctg caattgacgg tctcggatgt 7200 

ggacacgggg gccgcacgcc tggatctatc gctgttcttg ttcgaggacc acgaactcaa 7260 

cgtttgcggt tttctggaat atgccacgga ccgtatcgac gccgcatctg cgcaaaacat 7320 

ggtgcgcatg ctcagcagcg tgctacgcga gttcgttgcg gogccgcagg cgccgctcag 7380 

cgaagtacag ctgggggcgg cggattccca agcccagaca cctgcgatcg caccagcatt 7440 

cocaagcgtg ccggctcgtc tgttcgcctt ggcagacagt caccccaatg cgaccgcgct 7500 

gcgtgacgag caaggtgaac tgacctatgc gcaagtttgc caacagattc tgcaggcagc 7560 

ggccactotg cgagcccagg gggcgaaacc tggaaccctg atcgcggtca tcggcgagcg 7620 

cggtaacccc tggttgatcg ccatgttggc gatctggcaa gtcggcggta tctatgtgcc 7680 

attgtocaag gacctgcccg aacagcgcct gcaaggcatc ctggcggaac tcgaaggggc 7740 

catactgatt accgacgaca ccacgccgga acgcttccgg caacgtgtga cgctgcccat 7800 

gcacgcctta tgggccgatg gcgcaacgca tcacgagcgg oagacgacgg acgccagccg 7860 

gctgtctggc tacatgatgt acacctcggg atcgaccggt aaaccgaaag gcgtgcatgt 7920 
cagccaggcc aacctggtcg cgaccctgag cgcattcggc cagctgctgc aggtgaaacc 7980 

cagcgatcgg atgctcgcac tgacgacctt ctccttcgac atttcgctgc tcgagctgct 8040 

gcttcccctg gtccagggcg ccagcgtgca aatcgctgtc gcacaggctc aacgcgatgc 8100 

ggaaaagctc gcgggctatc tcgcagaccc tcggatcacg cttgttcagg ocacaccggt 8160 

gacctggaga ctattactgt cgacaggctg gcagccacgg gaaagcctga ccctgctgtg 8220 

cggtggcgaa gcgctgccac aggatctggo ggacaggttg tgcttgccgg gcatgacctt 8280 

gtggaacctc tacggcccca ccgaaacaac aatctggtcc acggcctgcc gcctgcaacc 8340 

gggtgcgccg gtgcaactgg gccatcccat tgcaggtacg caaatagccc tggtggatcg 8400 

gaacctgcgc agcgtgccca gaggggttat cggtgaactg ctgatttgcg gccccggcgt 8460 

cagccagggc tactatcgca acccggttga aacagccaag cggttcgtac cggacccgca 8520 

tggttcaggt aagcgcgcct atctgaccgg cgaccggatg ogcatgcagc aggatggttc 8580 
gctggcctat atcggccgac gtgacgacca gatcaagctg cgcggccacc gtatcgagct 
gggagagatc gagacagcgt tgcgaaaact gcccggcgta cgggatgctg ccgcccaact 

ccatgaccag gacccaagtc gaggcataca ggcctttgtc cagctttgcg caacggtcga 8760 
tgagagcctc atcgatatag gccagtggct ggaaacactg cgocaaacgc tgcctgaggc 
gtggctgcct actgagtatt acaggatcga tggcatccct cttacctaca acggcaaacg 
cgaoaggaag cgcctcctgc accaggccgt caggctgcaa acactcagtc tgagggtggc 
tcccagcagt gacaccgaga ccogggtgca gcagatctgg tgcgagctgc tcggtctcga 
ggatatcggc gttacggatg attttttcca gttaggcggc cactccattc tggtggcgcg 

catggtcgag cgcatcgaaa ccgcgtttgg acggcgcgta cctatcgcag atatctattt 9120 

ttctccgacg atcgcccgtg tggcggcgac gotggactcc atgacatttg aacaaggact 9180 

ggccgcacac agcgtgaaag gcgattggga gttcaccgcc atcagccttc aacacaacgc 9240 

cgacagcaca gccgccgctc aggagagatg aatcatgcac agccccacta tcgatacttt 9300 

cgaggccgca ctgcgctcat tgcccgctgo ccgcgacgca cttggtgcct atcccttgtc 9360 

cagcgaacaa aagcgcctct ggttactggc ccaactggcg ggcacggcaa cgttgccggt 9420 

aacggtgcgt tatgcattca ccggcacggt ggaccttgct gtcgtgcagc agaacctgag 9480 

cgcgtggatc gcacacagcg agtccttacg cagccttttc gtcgaagtac tggaacgccc 9540 

cgtcaggctt ctgatgocta cgggcctggt gaaactggag tacttcgatc gcccgccatc 9600 
cgatgccgat atggccgagc tcataggcgc cgcctttgaa ctcgacaaag ggccgttgct 
gcgtgcgttc atcactcgaa ccgctgcaca acagcatgaa ttgcatctgg tcggccatcc 9720 
tattgtcgtg gacgaacctt ccctgcagcg cattgcccaa accctcttcc agaccgaacc 9780 
cgatcatcag taccccgccg tcggtgcgat cgccgaggta ttccagcgcg aacagacact 9840 
ggcacaggat gcgcaaatca ccgaacaatg gcagcaatgg ggaataggcc ttcaggcgcc 9900 
tgcggcaacc gaaattccga ccgaaaaccc ccgccccgct atcaagggct cagatcgtca 9960 
agtacatgaa gcccttactg catggggcga ccaacccgta gcagaggccg aaattgtcag 10020 
cagttggctg accgtgctga tgcgctggca gggatcgcaa- tcggcgcttt gcgcaatcaa 10080 
ggtgcgcgac aaggcgcatg ccaacttgat cggcccactg caaacctacc tgccggtccg 10140 
cgttgatatg ccggatggca goaccctggc acaactgcga ctccaggtgg aggaacagct 10200 
caatggcaac gaccatccgt ccttttccac gctgctggaa gtttgcccac caaagcggga 10260 
cctgagtcgc accccctact tccaaaccgg cctgcagttc attgcgcacg atgttgaaca 10320 
gcgcgacttc catgccggca acttgacacg cctgccaacg aagcagccaa gcagcgacct 10380
tgacctgttc atttcctgct gggtaagcga cggcacgctt ggcctgacgc tggattatga 10440 
ttgcgccgtg ctgaattcga gccaggtcga ggttctggcc caggcgctca tcagcgtatt 10500 
gtcagcgccc ggtgaacagc caatcgcaac cgttgcgctg atgggccagc aaatgcagca 10560 
aaccgtcctg gctcaggccc acggcccccg cacgacgccg ccgcaactga cactgaccga 10620 
atgggtcgcc gccagcacgg aaaaatcccc gctggcggtt gcggtgatcg accacggcca 10680 
gcagctcagc tatgcagagt tatgggcaag agctgcactg gtagcggcga acatcagcca 10740 
gcatgtggca aagcctcgga gcatcatcgc tgtagcactg cccagatcgg ctgaatttat 10800 
tgcagcgctg ctgggggtag tgcgagcagg tcatgcgttc ttgcccatcg atccccgcct 10860 
gcccaccgac cgcatccagt tcctgattga aaacagtggc tgtgagttgg tcattacctc 10920 
tgatcagcaa tccgtggagg gttggccgca ggtcgccagg atacgaatgg aggcgcttga 10980 
tccagacatt cgctgggtgg cgccgacggg gctcagccac agcgatgccg cctacctgat 11040 
ctatacctcc ggcagcaccg gcgttccgaa gggagtcgtt gtcgagcacc ggcaagtagt 11100
gaataacatc ttgtggcggc aacgaacctg gccgctgacg gcacaggaca acgtgctgca 11160
taaccattcg ttcagcttcg atcccagcgt ctgggcgttg ttctggccgc tgctgaccgg 11220 
tggcaccata gtgctggcgg atgtcagaac catggaggac agcaccgccc tcctcgacct 11280 
gatgatccgc catgatgtca gcgttctggg tggcgtaccg agccttctcg gtacgctgat 11340
cgatcatcca ttcgccaatg attgccgggc ggtcaagctg gtgctcagtg gcggcgaagt 11400
cctcaacccc gaactggcac acaaaattca aaaggtctgg caggccgacg tcgccaacct 11460
ctatggccct accgaagcga ccatcgatgc gctgtatttt tcgatcgaca aaaatgctgc 11520
cggcgccatc ccgattggct atccaatcga caataccgac gcttatatcg tcgacctcaa 11580
tctcaaccca gtcccgccag gcgttccggg agaaatcatg cttgctggcc agaaccttgc 11640
gcgcggctat ttgggcaaac ctgcgcaaac cgcgcagcgc ttcctgccca acccatttgg 11700
caacggacgc gtgtatgcaa cgggcgatct gggacgacgc tggtcatcgg gggccatcag 11760
ctacctgggc cgacgcgacc aacaggtgaa gattcgcggg catcgcattg agcttaacga 11820
agtcgctcat ctgttgtgcc aggcgcttga gctgaaggaa gccatcgtct tcgcccagca 11880
cgctggaacc gaacaggcac gcctggtggc ggccatcgag caacagccag gcctgcacag 11940
tgaaggtatc aaacaggaat tgctgcgcca cttgccagcc tatctgatcc ctagccagct 12000
cctgctattg gatgaactgc caagaaccgc caccggcaag gtcgacatgc tcaagcttga 12060
tcagttggca gcccctcagc tcaatgacgc cgggggcacg gaatgccgtg cgccacgtac 12120
cgaccttgaa caatcggtca tgacggattt cgcccaagta ctcggcctca ctgcggtaac 12180
gccggacacg gatttcttcg agcaaggcgg caactcgatt ctactcacgc gcctggcagg 12240
caccttgtct gccaaatacc aggtgcagat tccactgcat gagtttttcc tgactccgac 12300
cccggcagcg gtggcgcagg caattgaaat ctaccgtcgc gaaggcctca cggcactcct 12360
gtcacgccag catgcacaaa cgctggagca ggacatctac ctggaagaac acattcggcc 12420
ggatggctta ccacatgcca actggtacca gccttctgtc gtgtttctga ccggagccac 12480
cggctacctg ggactgtacc tgatcgaaca gttgctcaag cgcaccacca gccgcgtcat 12540
ctgcctgtgc cgtgcaaagg atgccgagca tgccaaggcc aggattctgg aaggcctgaa 12600
aacctaccgc atcgacgtag gcagcgaact gyaccgggtg gagtacctca cgggcgacct 12660
ggcgttgccg cacctgggcc tgagcgagca tcaatggcaa acgctggccg aagaggtcga 12720
tgtgatttat cacaacggcg ccttggtcaa ctttgtctac ccctacagcg cactcaaggc 12780
gaccaacgtg ggaggcacgc aggccattct ggaattggcc tgcaccgctc gactcaagag 12840
tgttcagtat gtctccaccg tggatacgct cctggcgacg catgtccccc gcccttttat 12900
cgaggacgat gcccccctgc gttccgccgt cggcgtacca gtgggctaca caggcagcaa 12960
gtgggtggca gaaggggtgg ccaatcttgg cctgcgtcgc ggcattccgg tcagcatctt 13020
ccgcccgggc ttgatcctgg gccataccga aacgggtgcc tcgcagagca tcgactacct 13080
gctggtggcg ctacggggtt tcttgcccat gggcatcgtg ccggattacc cacgcatctt 13140
cgacatcgtg cccgtggact aygtcgccgc ggcgatcgtg cacatatcga tgcaaccgca 13200 
gggcagggac aaattcttcc acctgttcaa cccggcgccg gtcaccatcc gccagttctg 13260 
cgactggatt cgcgaattcg gttacgagtt caagttggtc gacttcgaac acggtcggca 13320 
gcaggcattg agcgtaccgc ccgggcacct gctgtacccg ttggtccccc tgatcaggga 13380 
tgccgatccg ctgccccacc gcgcgctgga ccctgactac atccatgaag tgaaccccgc 13440
actggaatgc aagcaaacct tagagctgct ggcctcctcg gacatcaccc tgtcgaaaac 13500
cacaaaggct tacgcgcaca caattttgcg ctacctgata gacaccggct tcatggccaa 13560
gcctggcgtg tagcggattg agcacaaaca ggacgaatat catggaatcg atagcctttc 13620 
ccattgcaca taagcccttc atcctgggct gtccggaaaa cctgccggcc accgagcggg 13680 
cgcttgcccc ttctgcggcg atggcgcggc aggthttgga gtacctcgaa gcgtgccccc 13740 
aggcgaaaaa cctcgagcag tacctcggga cgctgcgtga agtcctggcg cacctgcctt 13800
gtgcttccac cggactgatg accgatgatc cacgggaaaa ccaggaaaac cgcgacaacg 13860 
atttcgcctt cggtattgaa cgacaccagg gcgacactgt gaccctgatg gtcaaggcca 13920 
cccttgatgc agcgattcaa acgggcgagt tggtccaacg cagcggcact agcctggatc 13980
actcggagtg gagcgacatg atgtcagtcg cacaggtgat tctgcagacg attgccgacc 14040
ctcgggttat gcccgaatcc cgtttgacgt tccaggcacc gaaaagcaag gtcgaagaag 14100
atgaccagga cccgctgcga cgctgggtgc gtggccacct gctgttcatg gtcctgtgcc 14160 
aaggcatgag cctgtgtacc aacctcctga tcagcgcggc ccacgacaag gacctcgaac 14220
tggcgtgtgc acaggccaat cgcctgattc aactgatgaa catctcgcgc atcacgcttg 14280
agtttgcaac cgacctgaac tcacaacagt acgtcagcca gattcgcccg acgctcatgc 14340 
cgscgatcgc gccgcccaag atgagtggca tcaactggcg tgaccatgtg gtgatgattc 14400 
gttggatgcg ccagtccacc gatgcctgga acttcattga gcaggcctac cctcaactgg 14460
ctgaacgtat gcgaaccaca ttggcgcagg tctacagcgc tcatcggggg gtctgcgaaa 14520 
agttcgtagg cgaagaaaac accagtttgt tggccaagga aaacgccact aatacggccg 14580
gccaggtgtt ggaaaacctg aagaaatcga gattgaaata cctc^agaca aaaggttgcg 14640
ccggtgcggg ataagccctg actgcgcctc gcccccatca aaaccggact gatattcggg 14700
aaaacaaagg agagaagcat gccgacattt ctgggagacg acgacgcagt gccatgcgtg 14760
gtcgtcgtta acgccgacaa acactattcg atttggccaa gcgcgagaga cattccatca 14820
ggttggtccg aagaaggatt caaggggtca cgttcagact gcttggaaca tatcgcgcaa 14880 
atctggccag agccgacggc atagatacaa cgtgatgcaa aaaatgcggg aaacatcaac 14940 
taaccaaagc aaggaagaaa aatgacttca actcatcgca ccactgatca agtcaagcct 15000 
gctgttctgg atatgccagg cctgtcgggc attcttttcg gccacgccgc attccaatac 15060 
ctgcgggcca gctgcgaatt ggatctgttc gagcatgtcc gcgacctgcg cgaagccacc 15120 
aaggagagca tcagcagccg actgaagttg caggaacgcg ccgccgatat tctgctgctg 15180 
ggcgcgacct ccctgggcat gctggtcaag gaaaacggca tctaccgcaa tgccgatgtg 15240 
gttgaggatt tgatggccac ggacgactgg caacgtttca aggataccgt ggcctttgaa 15300 
aactatatcg tctatgaagg gcagctggac tttaccgagt ccctgcagaa aaacactaac 15360 
gtcggccttc agcgtttccc gggcgaaggg cgggacctct atcaccgcct gcaccagaat 15420 

cctaagctgg aaaacgtgtt ctaccgctac atgcgctcgt ggtctgaact ggccaaccag 15480 

gacctggtca agcacctcga cctgtcgcgc gtgaaaaaat tgctcgacgc gggtggcggt 15540 

gatgcggtca acgccatcgc cctggccaaa cacaatgagc aactgaacgt aacggtactg 15600 

gatatcgaca actccattcc ggtcactcag ggcaaaatca atgattccgg gctcagccac 15660 

cgggtgaaag cccaggcatt ggatatcctg caccaatcct tccctgaagg ttacgactgc 15720 

attctcttcg cccaccaatt ggtgatatgg accctcgaag aaaacaccca catgctgcgc 15780 

aaggcctacg atgcgctacc agaaggcgga cgcgtggtca tcttcaactc catgtccaac 15840 

gatgaaggcg acggcccggt catggccgca ctggacagcg tctactttgc ctgtctaccc 15900 

gccgagggcg gcatgatcta ttcctggaaa cagtatgagg tctgcctggc ggaagccggc 15960 

ttcaaaaacc ccgtacgcac cgcgattcca ggctggaccc cacacggcat catcgtggcc 16020 

tacaagtaat tttgcctcct ccgcccctac tggggccgga ggagtcattt caacatttgc 16080 

gtcattgacg ccacctggcg atagggacac ccacatggca cgttcacccg agacaaatag 16140 

tgcgatgccg caacagataa gacagctttt atacagccaa ctgatttcgc aatcgattca 16200

aaccttctgt gaactgcgcc tgcctgatgt tctgcaagca gctggccagc ctacctccat 16260 

cgaacggctt gctgagcaga cacacactca tatcagcgcc ctgtcacgct tgttgaaagc 16320 

gttgaaacca ttcgggctag tgaaagaaac cgacgaaggt ttttccttga ccgatctcgg 16380 

cgccagtctg acccacgacg cctttgcttc cgctcaaccc agtgctttgt tgatcaatgg 16440 

tgaaatgggc caagcctggc gtggcatggc gcagacaatc cgaaccggtg aatccagctt 16500
caagatgtac tatggcatca gcctgttcga gtattttgaa cagcacccgg aacgccgggc 16560
catttttgac cgttcccaag acatgggact ggacctggag atcccggaaa tcctggagaa 16620 
catcaacctg aatgacggtg agaacattgt cgatgtaggg ggtggttcag ggcatttgct 16680 
gatgcacatg ctggacaagt ggccagaaag cacaggcata cttttcgact tacccgtcgc 16740 
ggcaaaaatc gcgcagcaac atctgcacaa atctggaaaa gcaggctgct ttgaaatcgt 16800 
cgcaggggat tttttcaaga gcctgccgga cagtggcagc gtttacttgc tgtcccatgt 16860 
cttgcacgac tggggcgacg aagactgcaa ggccattttg gccacctgcc ggcggagcat 16920
gccggacaat gcgctgttgg ttgtagtgga cttggtgatt gaccagagtg aaagtgccca 16980 
gcccaacccc acgggcgcaa tgatggatct ttacatgctg tccttgttcg gtatcgccgg 17040 
aggcaaagag cgcaacgagg atgaattcag aaccctcatt gaaaacagcg gcttcaacgt 17100 
caaacaggtg aagcgcctgc caagtggaaa cggcatcatc ttcgcctacc caaaataaat 17160 
gatcctcatt gcccctcgcc actttccagg ggggctattt tattctcggg tgattccccc 17220 
cctaatgatt acaaggaaga cacatgtcga cgctggttta ctacgtagca gcaaccctgg 17280 
atggttatat cgccactcaa caacacaaac tggattggct ggagaacttt gccctggggg 17340
atgacgcaac ggcctatgay gatttttatc agacgatcgg agcagtggtc atgggatcgc 17400
agacctatga atggatcatg tcgaacgctc ccgatgactg gccctaccag gacgtacccg 17460
cctttgtcat gagcaaccgg gatctgtcag cccccgccaa tttggatatc accttcttac 17520 
gcggcgatgc cagtgccatc gcggtcaggg ccaggcaagc ggcgaagggc aagaatgtct 17580 
ggctggtcgg tggcggcaaa acggcggcct gttttgccaa cgcaggggaa ttacagcagc 17640 
tgttcatcac cactattcca acctttatcg gcaccggcgt tccggtactg cccgtagacc 17700 
gcgcgcttga agtggttctc agagaacaac gcacgctgca gagcggtgcc atggaatgca 17760 
tcctggacgt gaaaaaagcg gattaacgtc tacaagacaa tcgtgtatcg aaactcgcaa 17820 
cgtccaaacc caagggaaaa accagtgaag cgattggtat tgagtttatg tttgttggct 17880 
gttatcgctc tcgccagtgt tcaaggaata aggatggtga aacccgccgc cctgacagcc 17940 
gccgatgctc gcgatatcgg ctatctgaat gtacgcgata gcctttccgt cattgccgcc 18000 
gccccccacc ccaccgcctc acctcgccag gccgttgtca ggcattattt gcgggaaacg 18060 
attgcgggca tgggttacca ggtggttgag caaccctttc tatttaccat cgagagcatg 18120 
gtgaaccggc agaaaaccct ctatgccgag ttgaacgagc agcagcgcca agcgttcgat 18180
gctgagctgg cccgggtggg cgcggacagt tttgaaaaag aagtgcggat tcgctccggc 18240
ctactggaag gcgacagcgg ccagggaacc aacttgatag cctcacaccg cgtaccggga 18300 
gcgaccgcga cggtcctgtt catggcgcat tacgacagcg tcggcaccgc tcccggtgcc 18360 
agtgacgatg gcatggccgt cgcctcgata ctccaactga tgcgggaaac cataacccgc 18420 
agcgatgcca aaaataacgt tgtctttcta ctcrccgatg gcgaagaact gggcttgctc 18480 
ggagcggagc actacgtctc gcagctcagt acgcctgaac gtgaagccat ccgcctggtg 18540 
ttgaactttg aagcccgggg taaccagggc atccctttac tgttsgagac atcccagaag 18600 
gactacgccc tgatcaggac tgttaacgca ggggttcggg acatcatatc cttctcattc 18660 
acgcccttga tttacaatat gctacaaaac gacaccgact ttacggtgtt caggaaaaag 18720 
aacatsgcgg ggttgaattt tgcagtcgtg gagggttttc agcactacca ccacatgags 18780 
gacaccgtgg agaaccttga gccagagacc ttgtttcgct accaaaagac agtgcgtgaa 18840
gtgggcaacc actttatcca gggtatcgac ctctcctccc tgagtgctga tgaggacgca 18900 
acctatttcc cactgccagg cggcacgctg ttggtactca acttacccac cctgtatgcg 18960 
ctgggcatgg gctcgttcgt gctctgcggt ctttgggcgc aacgctgccg cactcgccga 19020 
cagcatcagg gcaagaattg cgtactgcgc cccatggcta ttgccctgct cggcattgcc 19080 
tgcgcagcac ttgtattcta cgtcccgagc attgcctatc tattcgtcat ccccagtctg 19140 
cttctggctt gcgccatgtt gtcgcgaagc ctctttatct cctattcgat catgctgctg 19200 
ggcgcttatg cctgcgggat actctacgcg cctatcgtct acctgatttc atcaggcctt 19260 
aaaatgccgt tcattgccgg ggtcattgca ctactcccgc tctgcctgct ggccgtggga 19320 
ctggccggcg tcatcgcacg atcgagagac tgtcgaacct gcgactagca agacccgata 19380 
aaacgtcgct tcaaacgcca gatgacgtgc ctcgtcagcc aggcgtggaa ccatctggtg 19440
gcggcaaatg tgcataaggt gggaacgcag agcgcccgct gcaacacgcc caccccaagc 19500 
accgcgcctc aacggataat caggctcaag ggaattccac cttgcaacct gaaagagcaa 19560 
tcgagcgccc gtcggacaca acaaactgat caccgtcaat tcgggcaagg agcaatccac 19620
gggcttttgc tccaacctca actccctttg aaaaatcagc cggccacaat ttgcccctac 19680 
cctttcagga tatcctcgat aagcgtttta tcagaacagc gaaaaaccac ttcaagttcg 19740 
tgtacttttt actgcgatct gcgatcgctc ccatggtaca araatgacag atgggaagat 19800 
cgctttaata cctactctcb cacctgagaa aaagtaacca ccgggccgta ttcctgatca 19860 
gacactatcg cctgcacaca aaatttcttc tctggaaact tactcagcaa caccatcctc 19920
catgactgag caataaggtt gcacagttgt tcataaacag cttcatcgac tctatcgctg 19980

cagcctgcaa aaacatcata caggtgcaca tgattcaaaa ccttctcaac cgccaccaca 20040 

tcacgattac aagactgcat ccaactggag aatgcttcaa cagtaaatcc gctttccaga 20100 

aacacgcccg actcgtaaac aacaaagtct ggaaacaaca cggtggaaaa caccaggaca 20160 

tcttcaggat gtacaacatc actaataaaa caagactcaa ccaactcgga ttgatccttc 20220 

cattgggact tccagcgctc gtatttggga gacctgacag ggtctgtcgt cctgattatt 20280 

ttcatagcac tgttcctgca ctgatgcctt ttggcgattt tttgtttgag cgagaatcga 20340 

tcgatggctc attgttcatc aagaaaaaaa ttctctcgaa acagactctt aagctcggcc 20400 

cggcgcgttg agaacagtga aaagtcactc gccacatcct gaatattctc tgtgaccctc 20460 

acaagatacc cagtattcag cttttcaatg ggaaagcctg acgcagcaat ccgctcaaca 20520

tcgacctctt cagcaaactc atcgccgtag aacatagccc aaccgatatt gggcacgtcc 20580 

ggtttcaaag ccgaattgaa tgagccgatc tgaaaacttt tatatttctc atgcgggcta 20640 

agttcgggtt cactgaacag gtgcagcatt ccgagttgag gaggaaaaat ttcacaccat 20700

gccttgaaga gcgaatacca atcgacgctc ttgctactga ccgagcggta tgtgattgcg 20760 

ccaggaacca cctgcccccg gacattgcgc gccgtgtgga tgacgctccc acagccttta 20820 

accgcctttt ttctgcgcca ggcaaagtcc aggtagaagt ccgataatgc gccgttaaac 20880 

cgtatggccg cctttgatgc ccagcacgct tcactcgcgg ccacgcccat gaaaggctct 20940

ccgaatttat ccgcattgtg cgacacctgc tctggaacca gtaactcccc ttcccgactt 21000 

aatgacgaaa tgaacgcctc tccaacaccc caaccgatag tttcagactt cgtcctgatt 21060 

gtaatttgca catagggctt cataaatcgt caaagtctcg tcaattcacg ggtgacacaa 21120 

gtatatccaa agagctctcc cacgttactc atcgcatcga gtctatcaac caaccaacgc 21180 

cgcctcacgc gccaccgccc acggcgccat caaccgctgc ggggtctggc aggttttgcg 21240 

cttgagcacg aaattgcggc atttctcggc gaattggtgg cggttgtgca ccatgtccaa 21300 

ctgcatctgc gccagctcgg cttcacggca gcgtgtaatc tggtcgatgt ccaacgccgc 21360

tttacgggcg cgagccacgg cgtacttttc atcggttaac gcgctgcttg cctgctgcat 21420 

cagccaacgg ctgaaggcgt gcgggcagcg cgggccgatg ccctgaacca gcccgtattg 21480 

ctcggcttgc aaggcactga tcggcaggca ggcgtcggtg agttgatggg cgacttcact 21540 

gccaacggcg cgtggcaggc tgtaggtcca gtattcggag ccgtacaggc ccatggtttt 21600
gtaatgcggg ttgagtacca cgctctcgcg ggccaatacg atgtctgcgg ccagcgccag 21660

cattacacca ccggcgccgg cgctgccggt caggccgctg atcaccagtt gccgggccgt 21720 

gagcagttcg tggcacacat cgtacgatgg cctgaatgtt ggcccaggct tccagccccg 21780 

gcactggggc ggcctggatg acgttgaggt gcacaccatt ggaaaagctg ccgcgcccgc 21840 

ccttgatcac cagcacttgg gtgtcccgcg tcttggccca gcgcaacgcc gccaccagtc 21900 

gctggcactg ctcggtgctc atggcgccgt tgtagaactc aaaggtgagt tcaccgacat 21960 

ggccggcttc gcgatagcga atcggttgat aggcttgctc atcgaacatt tgattggcga 22020

tcgagctgtc cagcacggga atatccgcca gtgcttccgc cagcacgtgg cgggccggca 22080 

gcttgaaggt ctcctccccc ggccgggctt tgcgtttgag cgagccgatc cacaggctct 22140 

gatcaccggc cgccaccagc accgcgtcgt cctgcaccgc gaggatctca cccggtgtgc 22200 

cgtggcgcgc atccaggtgc gcgtcgtaca ggtaatactg cccgccctgg atactggcca 22260 

gcacaccggg ctggccatcg gctgcgtcga tgcagcgttt gatgaagcgt gcgcaatcgt 22320 

accaactgaa ggtgcgatca gcctgtgtca tgttcggctg caaacgcccg attacgtggg 22380 

cttgggtgta atcgagcggc accgggacga aaacccgggc gaacttttcc accacgtcgc 22440 

ggatgcaata gagggcggcg tcactcaccg cgccgttgta cagctcggat ttgcgcacat 22500 

cggcaggcat gtcgaattca caggtcgacc agatcggccc ggcgtccatt tcctccaccg 22560 

cctgcaaagc cgtgacgccc cagcggccga cctgctggct gatggcccag tccagcgcgc 22620 

tggcaccacg gtcgccgacg atgcccggat ggataatcac cacagggcgc tcaaggttgc 22680 

tccaaagttg ctgtggcaca cggtctttca gaaaggggca gatcaccagg tcggcgtctg 22740 

aatcctcgat ctgctggcac accaaggctg gatcggtgaa cagaacaacg ctgggcgcgt 22800 

gccccgactg gcgtaaatcc agccaggccc gctgggtcaa accgttgaac gccgacgcta 22860 

acacgatgat cttcaatgac cgcatggctg actcatcctt gagaatgcgc ggccagaggt 22920 

gctccttgag ccctccctgg cctttgatgg aagtacaagg atagttggcg tgccaggcag 22980 

gctacctgat caggatcaat cttgtgtcag cgagtgcttg aacgtaggcg cctgcgttca 23040

accaataggc gcatggsctg gcgagtgctc ccgcgtgccc tctgccacaa gggacgccag 23100 

gtattcgcca aacccgccgc gacacttgta gtcccgacgc gcactggtga ccacgggatt 23160 

ggtcgccgtc cacacgatcc gcgcgccaat cctctcgagg tcggccacca actgcacatc 23220 

ttcatgggca accaaatgct ggaacccacc cgcgtttcga taggcatccg cactcaagcc 23280 

caggttggca ccgtgtatat ggcggtggtt ctcggtgaac tgatacaact caaggtagcg 23340
cgaacgaacc gattcaccgt actcgctcca gctgtccacc tcgacggttc cgcacaccgc 23400
atcggcgcca aagccgatct gacgcaccag ccagtcgrcg ggcacaactg tgtcagcgtc 23460 

ggtgaatgcc agccactggg cgccgacttc aagcaatcgc tccgcgccca aggccctggc 23520 
cttgcccaca tttcgaacgc tcacctcaag cgtkgcgaca cccatggccg acacgcgcgt 23580 

ggcggtctcg tccgaacacg catccagcac caccagcaat tggacctgtt ggtgtgccag 23640 

agccggatga gcaatggcgc gctggatgga ggcgaggcag gcactgatgt gccgttcttc 23700 

gttatgggca ggtatcacta tccctatcat tgacgttccc tctaccaggc aaagtgtcta 23760 

cagctatcga ccgggccgtg aggcagaagg tttaaacaat ctgaaggcgc cgccaaacaa 23820 

tgacgtgaga caggtcgcag tgattaaacg gaacgtcaca ggcgccacag gctcagatgg 23880 

tttacgtgtt tgatgcacgg atgaacccgc cattcctaca aacaggtcag ccatcatgtc 23940 

taacgattat caaggtatcg ccagtgtcat cacggcttct cgtcacatgg gtacagactc 24000 

ggatgaacgc cttaatgaga cggtaaatat tcaattgacc tgcagcggta aaccaacgat 24060 

tgcgcggttg agtttcgaca ccccgcttca atggcccggc caccccaact ttgtgctgat 24120 

caacctgccg gacggttcat cggtgggtgg tgtgattgcc gaaattgaaa agtcgaccga 24180 

tggcccgggt tgggtgacgt ttacggtgga tgactgaggt cttcccaaca ggcttcaaat 24240 

cacctccagg cggctgcctc gaatgagaca cacaggccag taatcgagac gcacagacaa 24300 

gcctattttc gcagatacat tttgtaacgt cctatgattg acgcttgctc gaatcaccgc 24360 

agggattggg tggcgtgtgt ttatcacgcc cttgaatccg cagcgaaaaa tgattcgagt 24420 

tcagcgaaca attcgattgg gacaaacaaa aggatgcggg ctatgtcatt gcgtaattta 24480

tctttattgg tcaccacact ggcgctgttt aagtggggtg taatgcgctc gcggggcaaa 24540 

acccaacatg ctcagtgatg atgacgtgaa atcccaaagc gccggcgcac tgggctatgc 24600 

cccgacagac ctgagcatcg tcaaccgtcg aaccgaaggc accaacacct acgtgctgct 24660 

taaaaccaac gacaacaagc agttcaactg cattatcaac ggaggcaata tcctgacctt 24720 

cggtatgtcc aacccgcctt cgtgtgcgaa gaaaggtgaa cagatcaaga gtggcccgtt 24780 

cgggagctga tctgtcgctg gaaaaaaggg ccaggccacc tctaagaacg gaggcctggc 24840 

ccttttttat tcgctcagat gagtttaaaa gacaagatat cgggcagctg ggctccggcc 24900 

cgttcagtct gggcacccca cacaaaatgc tcagcgacta cttggccgtc gccgcacacc 24960 

gtttaacggg tgcgacctac agcgtcrccc tggttgaagg cagcaacgaa taaaccctat 25020
tgatcggaga gcgaccatgc acccgcataa aaccgcgatt gtcttgattg aataccagaa 25080
cgacttcacc acccccggcg gcgtgttcca tgacgctgtg aaagacgtca tgcaaacgtc 25140
caacatgctg gcgaataccg ccaccacgat tgagcaggcc cgcaagctgg gcgtgaagat 25200 
catccactta cccatccgct ttgccgacgg ctacccagag ctgaccctgc gctcatacgg 25260 
cattctcaaa ggcgtcgccg acggcagcgc gtttcgtgcc ggcagctggg gcgccgagat 25320 
caccgacgcg ctgaaacgcg accccaccga tattgtgatc gaaggcaaac gcggcctgga 25380 
tgctttcgcc accaccgggc tggacctggt gctgcgcaac aatggcatcc agaacctggt 25440 
tgtcgcaggt ttcctgacta actgctgcgt tgaaggcacg gttcgatccg gttacgagaa 25500 
aggttatgac gtggtgacct tgaccgactg caccgcgaca ttcagtgatg aacaacagcg 25560
cgcagccgag cagtttacgt tgccgatgtt tttcgcaaac cctgcaacac accgcgtttc 25620 
tgcaagcact gaacgccgga taaaaaaagc ggcggactcc tgccgagtcg ccgctttttt 25680 
agtgcttggg tcattcggtt ggcgcgtact gcatttcgcc gttcccaaac gaccagtctt 25740
cgcgcttcac gtccaccagg ctgatccaca cgtcttcctt gcgcagcccg gtcttggcat 25800 
ggatgccgtc ggcgatgaac ttatagaaag cctttttcac gtcaatgctg cgcccggcgt 25860 
tccacgtgac ttggataaac acgatcttgg gtgtgtaagt gacgccaaga tacccggccg 25920
ccgggtaaac cagctcatcc ttggcatggc ggttgatgat ctggaatttg tcgtgctcag 25980
gcacgttggc ccacactggt catcgcggcg tacacgacat caccgatggc cgtcgcggtt 26040 
tcagtggaag tgtcggcggc gaggtcgatt cgaactaaag gcatggacaa atccttagtg 26100 
attttcagct gaaaatgggc gtgtggctca cacactcgcg ccaaccgggc aacttgcgcc 26160 
aggccaacga gttgctggcc cagggagttg ccgacggttt gcgctagtgc gccgcgaaac 26220 
ttcggcattt gacgcatcgg tgaatggctg accggatgtc agtgcttatt gacctgaata 26280 
tagactgccg tgcacagacc aatcaaacaa ataccggcga tgtagtaagc ggcgcccatc 26340 
tgactgtatt gaagcaatag agtaacgacc atcggcgtca ggccaccgaa tacggcgtac 26400
gacaagttgt aggaaaatga caagccggaa aaccgcacta ccggtggaaa ggcacgcacc 26460 
atcacagcag gggctgcgcc tatcgcgccg acaaaaaaac cggtaagtga atagagtgga 26520 
accagccatt gcgggtgcgt ttcaagcgtc ttgaacaaga gcagtgcgct gaacagaagc 26580 
atgacgctgc cgatcatcaa tacccarccc gcactgaaat gatcggccag tttcccggcg 26640 
atcacgcaac caacactcaa rrcacrcaat agcgagactg ttggcctgca aggcttgcgc 26700 
tgcag 26705

<210> SEQ ID 2
<211> Length: 1004
<212> Type: PRT 

<213> Organism: Pseudomonas fluorescens A2-2 
<400> SEQUENCE 2 

Met Leu Leu Glu Val Ala Phe His Val Ile Thr Hie Leu Ser Thr Ser
1 5 10 15

Gln Leu Val Ser Arg Ile Glu Arg Val Val Glu Arg His Ala Ser Leu
20 25 30

Arg Gln Arg Phe Val Met Arg Asn Gly Thr Tyr Trp Ile Glu Gln Ala
35 40 45

Pro Pro Gln Gln Arg Arg Tyr Cys Val Val Arg Thr Tyr Asp Glu Ala
50 55 60

Ser Thr Asp Ala Leu Leu Ala Pro Ser Arg Glu His Ile Gly Val Glu
65 70 75 80

Ser Glu Arg Leu Phe Arg Ala Glu Val Val Glu Arg Ser Asp Gly Gln
85 90 95

Arg Tyr Leu Val Phe Arg Ile His His Ile Ile Ala Asp Leu Trp Ser
100 105 110

Val Gly Leu Leu Ile Arg Asp Phe Ala Glu Asp Cys Met Asp Arg Ser
115 120 125

Ser Ile Thr Leu Ala Ser Arg Pro Ile Ala Pro Leu Ile Asp Pro Glu
130 135 140

Phe Trp Arg His Gln Met Ser Gln Asp Thr Pro Phe Ser Leu Pro Met
145 150 155 160

Ala Ser Leu Glu Gln His Thr Asp Arg Arg Met Val Leu Ser Ser Phe
165 170 175

Val Ile Asp Gln Glu Ser Ser Ala Asp Leu Ala Arg Leu Ala Thr Ala
180 185 190

Cys Ala Val Thr Pro Tyr Thr Val Met Leu Ala Ala Gln Val Leu Ala
195 200 205



Leu Ser Arg Ile Gly Gln Ser Gly Arg Leu Ser Leu Ala Val Thr Phe
210 215 220

His Gly Arg Asn Arg Gly Asn Lys Asp Ala Val Gly Tyr Phe Ala Asn
225 230 235 240

Thr Leu Ala Val Pro Phe Asp Val Ser Glu Cys Ser Val Gly Glu Phe
245 250 255

Val Lys Arg Thr Ala Lys Arg Leu Asp Glu Ala Ser Lys Ala Ser Val
260 265 270

Gly Ala Gly Tyr Pro Glu Leu Ala Glu Phe Met Thr Pro Leu Gly Trp
275 280 285

Ala Ala Thr Ala Pro Thr Asn Ala Val Ile Tyr Gln Gln Asp Met Pro
290 295 300

Gly Met Pro Arg Gly Leu Ala Ala Ala Leu Leu Gly Leu Gly Thr Val
305 310 315 320

Gln Leu Gly Glu Met Ala Leu Thr Ala Glu Gln Ala Pro Pro Ser Ile
325 330 335

Gly Pro Phe Ala Thr Ala Leu Leu Leu Thr Arg His Asp Gly Lys Leu
340 345 350

His Gly Arg Val Glu Val Asp Pro Ala Gln His Pro Gly Trp Leu Ala
355 360 365

Glu Ala Leu Ala Arg Gln Phe Ala Val Ile Leu Arg Glu Met Val Arg
370 375 380

Asp Pro Gln Ala Arg Leu Ser Ala Leu Pro Ala Cys Leu Leu His Gln
385 390 395 400

Pro Lys Tyr Pro Ser Gln Ala Arg Pro Ala Pro Ala Ser Glu Thr Leu
405 410 415

Val Ala Thr Phe Leu Arg Gln Val Ala Ile Thr Pro Asp Lys Pro Ala
420 425 430

Leu Arg Thr Pro Gln Ala Ser Ile Ser Tyr Ser Glu Leu Ala Ser Arg
435 440 445

Val Ala Arg Leu Ser Ala Ala Leu Arg Val Arg Gly Phe Lys Pro Glu
450 455 460

Gln Thr Leu Ala Ile Leu Leu Pro Arg Asp Ile Asn Leu Val Pro Ala
465 470 475 480

Leu Leu Ala Ile Met Ala Cys Gly Gly Ser Tyr Val Pro Leu Ser Asp
485 490 495

Ala Asn Pro Ala Glu Leu Asn Arg Ser Ile Leu Thr Arg Ala Arg Cys
500 505 510

Arg Ala Ile Leu Thr Asp Gln Glu Gly Leu Thr Arg Phe Ala His Leu
515 520 525

Ala Pro Cys Trp Ser Leu Ser Asp Leu Leu Ser Met Pro Asp Ala Pro
530 535 540

Leu Gln Asp Gln Ser Lys Leu Gln Ala Lys Ala Tyr Ile Leu Phe Thr
545 550 555 560

Ser Gly Ser Thr Gly Glu Pro Lys Gly Val Ala Ile Thr His Ala Asn
565 570 575

Ala Ala Asn Leu Leu Arg Trp Ala Ala Leu Asp Cys Gly Pro Glu Tyr
580 585 590

Leu Ala Gln Thr Leu Ala Ala Thr Pro Thr Thr Phe Asp Leu Ser Ile
595 600 605

Phe Glu Met Phe Ala Pro Leu Met Val Gly Gly Cys Val Gln Pro Val
610 615 620

Ser Ser Val Met Ala Leu Ile Asp Asn Pro Ala Leu Leu Lys Gly Thr
625 630 635 640

Thr Leu Ile Asn Thr Val Pro Ser Val Ala Asp Ala Leu Leu Gln His
645 650 655

Asp Val Leu Val Pro Ser Leu Arg Met Leu Asn Leu Ala Gly Glu Pro
660 665 670

Leu Asn Arg Asp Leu Tyr Leu Arg Leu Gln Ala Lys Leu Thr Ala Thr
675 680 685

Arg Ile Val Asn Leu Tyr Gly Pro Thr Glu Thr Thr Thr Tyr Ser Thr
690 695 700

Ala Leu Val Ile Glu Pro Ala Gln Gln Glu Ile Thr Ile Gly Phe Pro
705 710 715 720

Leu Tyr Gly Thr Trp Val Asp Val Val Asp Gln Asn Met Gln Ser Val
725 730 735

Gly Ile Gly Val Pro Gly Glu Leu Ile Ile His Gly His Gly Val Ala
740 745 750

Gln Gly Tyr Val Ser Asp Pro Val Arg Ser Ala Ala Ser Phe Leu Pro
755 760 765

Ala Ser Asp Gly Leu Arg Cys Tyr Arg Thr Gly Asp Arg Val Arg Trp
770 775 780

Leu Pro Asp Gly Arg Leu Asp Phe Ile Gly Arg Glu Asp Asp Gln Val
785 790 795 800

Lys Val Arg Gly Phe Arg Val Glu Leu Gly Pro Val Gln Ala Ala Leu
805 810 815

His Ala Ile Glu Thr Ile His Glu Ser Ala Val Val Val Val Pro Lys
820 825 830

Gly Gln Gln Arg Ser Ile Val Ala Phe Ile Val Leu Lys Ala Pro Ser
835 840 845

Glu Asp Glu Ala Val Gln Arg Asn Asn Ile Lys Gln His Leu Leu Gly
850 855 860

Leu Pro Tyr Tyr Ala Leu Pro Asp Lys Phe Ile Phe Val Lys Ala
865 870 875 880

Leu Pro Arg Asn Thr His Gly Lys Ile Asp Arg Thr Leu Leu Leu Gln
885 890 895

His Glu Pro Gln Thr Glu Gln Glu Ser Ala Met Arg Asp Ala Thr Asp
900 905 910

Val Glu His Arg Ile Ala Asn Cys Trp Gln Thr Ile Ile Gly His Pro
915 920 925

Val Gln Leu His Glu Asn Phe Leu Asp Ile Gly Gly His Ser Leu Ser
930 935 940

Leu Thr His Leu Thr Gly Leu Leu Arg Lys Glu Phe Asn Ile His Ile
945 950 955 960

Ser Leu His Asp Leu Trp Ile Arg Pro Thr Ile Glu Gln Gln Ala Asp
965 970 975

Phe Ile His Lys Leu Gln Asn Ser Val Leu Thr Lys Pro Ala Ala Ala
980 985 990

Pro Ile Pro Arg Leu Asp Arg Lys Ile Ser His His
995 1000

<210> SEQ ID 3
<211> Length: 1062
<212> Type: PRT 

<213> Organism: Pseudomonas fluorescens A2-2 
<400> SEQUENCE 3 

Met Ser Val Asp Thr Cys Arg Thr Ala Thr Phe Pro Ala Ser Tyr Gly
1 5 10 15

Gln Glu Gln Ile Trp Phe Leu Asn Glu Leu Asn Pro His Ser Gln Leu
20 25 30

Ala Tyr Thr Leu Ala Met Lys Val Ser Ile Ala Gly Lys Leu Asn Thr
35 40 45

Leu Arg Leu Gln Arg Ala Val Asn Gln Val Val Ala Ser Gln Glu Ile
50 55 60

Leu Arg Thr Ser Phe Ala Tyr Lys Asn Gln Lys Leu Ser Gln Val Ile
65 70 75 80

Ser Pro Ser Ala Thr Leu Pro Ile Arg Ser Ala His Cys Ile Asp Asp
85 90 95

Val Pro Gly Leu Gln Arg Leu Ile Asn Met Glu Ala Gln Arg Gly Trp
100 105 110

Ser Leu Ser Ser Ala Pro Leu Tyr Arg Leu Leu Leu Ile Lys Thr Gly
115 120 125

Asp Gln Gln His Glu Leu Val Ile Cys Thr His His Ile Val Cys Asp
130 135 140

Gln
145 150 155 160

Gly Gln Ser Asp Gly Arg Val Leu Thr Ser Pro Asp Glu Glu Thr Leu
165 170 175

Gln Phe Val Asp Tyr Ala Ala Trp Ser Arg Gln His Glu Tyr Ala Gly
180 185 190

Leu Glu Tyr Trp Arg Gln Gln Leu Ala Asp Ala Pro Thr Ile Leu Asp
195 200 205

Ile Ser Thr Lys Thr Gly Arg Ser Glu Gln Gln Thr Phe Leu Gly Ala
210 215 220

Arg Ile Pro Val Glu Phe Ser His His Gln Trp Gln Ala Leu Arg Gln
225 230 235 240

Ile Phe Arg Pro Gln Gly Ile Ser Cys Ala Ala Val Phe Leu Ala Ala
245 250 255

Tyr Cys Val Val Leu His Arg Leu Ala Glu Gln Asp Asp Ile Leu Ile
260 265 270

Gly Leu Pro Thr Ser Asn Arg Leu Arg Pro Glu Leu Ala Gln Val Ile
275 280 285

Gly Tyr Leu Ser Asn Leu Cys Val Phe Arg Ser Gln Tyr Ala His Asp
290 295 300

Gln Ser Val Thr Asp Phe Leu Gln Gln Val Gln Leu Thr Leu Pro Asn
305 310 315 320

Leu Ile Glu His Gly Glu Thr Pro Phe Gln Gln Val Leu Glu Ser Val
325 330 335

Glu His Thr Arg Gln Ala Gly Val Thr Pro Leu Cys Gln Val Leu Phe
340 345 350

Gly Tyr Glu Gln Asp Val Arg Arg Thr Leu Asp Ile Gly Asp Leu Gln
355 360 365

Leu Thr Val Ser Asp Val Asp Thr Gly Ala Ala Arg Leu Asp Leu Ser
370 375 380

Leu Phe Leu Phe Glu Asp Glu Leu Asn Val Cys Gly Phe Leu Glu Tyr
385 390 395 400

Ala Thr Asp Arg Ile Asp Ala Ala Ser Ala Gln Asn Met Val Arg Met
405 410 415

Leu Ser Ser Val Leu Arg Glu Phe Val Ala Ala Pro Gln Ala Pro Leu
420 425 430

Ser Glu Val Gln Leu Gly Ala Ala Asp Ser Gln Ala Gln Thr Pro Ala
435 440 445

Ile Ala Pro Ala Phe Pro Ser Val Pro Ala Arg Leu Phe Ala Leu Ala
450 455 460

Asp Ser His Pro Asn Ala Thr Ala Leu Arg Asp Glu Gln Gly Glu Leu
465 470 475 480

Thr Tyr Ala Gln Val Cys Gln Gln Ile Leu Gln Ala Ala Ala Thr Leu
485 490 495

Arg Ala Gln Gly Ala Lys Pro Gly Thr Leu Ile Ala Val Ile Gly Glu
500 505 510

Arg Gly Asn Pro Trp Leu Ile Ala Met Leu Ala Ile Trp Gln Val Gly
515 520 525



Gly Ile Tyr Val Pro Leu Ser Lys Asp Leu Pro Glu Gln Arg Leu Gln
530 535 540

Gly Ile Leu Ala Glu Leu Glu Gly Ala Ile Leu Ile Thr Asp Asp Thr
545 550 555 560

Thr Pro Glu Arg Phe Arg Gln Arg Val Thr Leu Pro Met His Ala Leu
565 570 575

Trp Ala Asp Gly Ala Thr His His Glu Arg Gln Thr Thr Asp Ala Ser
580 585 590

Arg Leu Ser Gly Tyr Met Met Tyr Thr Ser Gly Ser Thr Gly Lys Pro
595 600 605

Lys Gly Val His Val Ser Gln Ala Asn Leu Val Ala Thr Leu Ser Ala
610 615 620

Phe Gly Gln Leu Leu Gln Val Lys Pro Ser Asp Arg Met Leu Ala Leu
625 630 635 640

Thr Thr Phe Ser Phe Asp Ile Ser Leu Leu Glu Leu Leu Leu Pro Leu
645 650 655

Val Gln Gly Ala Ser Val Gln Ile Ala Val Ala Gln Ala Gln Arg Asp
660 665 670

Ala Glu Lys Leu Ala Gly Tyr Leu Ala Asp Pro Arg Ile Thr Leu Val
675 680 685

Gln Ala Thr Pro Val Thr Trp Arg Leu Leu Leu Ser Thr Gly Trp Gln
690 695 700

Pro Arg Glu Ser Leu Thr Leu Leu Cys Gly Gly Glu Ala Leu Pro Gln
705 710 715 720

Asp Leu Ala Asp Arg Leu Cys Leu Pro Gly Met Thr Leu Trp Asn Leu
725 730 735

Tyr Gly Pro Thr Glu Thr Thr Ile Trp Ser Thr Ala Cys Arg Leu Gln
740 745 750

Pro Gly Ala Pro Val Gln Leu Gly His Pro Ile Ala Gly Thr Gln Ile
755 760 765

Ala Leu Val Asp Arg Asn Leu Arg Ser Val Pro Arg Gly Val Ile Gly
770 775 780

Glu Leu Leu Ile Cys Gly Pro Gly Val Ser Gln Gly Tyr Tyr Arg Asn
785 790 795 800

Pro Val Glu Thr Ala Lys Arg Phe Val Pro Asp Pro His Gly Ser Gly
805 810 815

Lys Arg Ala Tyr Leu Thr Gly Asp Arg Met Arg Met Gln Gln Asp Gly
820 825 830

Ser Leu Ala Tyr Ile Gly Arg Arg Asp Asp Gln Ile Lys Leu Arg Gly
835 840 845

His Arg Ile Glu Leu Gly Glu Ile Glu Thr Ala Leu Arg Lys Leu Pro
850 855 860

Gly Val Arg Asp Ala Ala Ala Gln Leu His Asp Gln Asp Pro Ser Arg
865 870 875 880

Gly Ile Gln Ala Phe Val Gln Leu Cys Ala Thr Val Asp Glu Ser Leu
885 890 895

Ile Asp Ile Gly Gln Trp Leu Glu Thr Leu Arg Gln Thr Leu Pro Glu
900 905 910

Ala Trp Leu Pro Thr Glu Tyr Tyr Arg Ile Asp Gly Ile Pro Leu Thr
915 920 925

Tyr Asn Gly Lys Arg Asp Arg Lys Arg Leu Leu His Gln Ala Val Arg
930 935 940

Leu Gln Thr Leu Ser Leu Arg Val Ala Pro Ser Ser Asp Thr Glu Thr
945 950 955 960

Arg Val Gln Gln Ile Trp Cys Glu Leu Leu Gly Leu Glu Asp Ile Gly
965 970 975

Val Thr Asp Asp Phe Phe Gln Leu Gly Gly His Ser Ile Leu Val Ala
980 985 990

Arg Met Val Glu Arg Ile Glu Thr Ala Phe Gly Arg Arg Val Pro Ile
995 1000 1005

Ala Asp Ile Tyr Phe Ser Pro Thr Ile Ala Arg Val Ala Ala Thr
1010 1015 1020

Leu Asp Ser Met Thr Phe Glu Gln Gly Leu Ala Ala His Ser Val
1025 1030 1035

Lys Gly Asp Trp Glu Phe Thr Ala Ile Ser Leu Gln His Asn Ala
1040 1045 1050

Asp Ser Thr Ala Ala Ala Gln Glu Arg
1055 1060



<210> SEQ ID 4 
<211> Length: 1432 
<212> Type: PRT 

<213> Organism: Pseudomonas fluorescens A2-2 
<400> SEQUENCE 4 

Met His Ser Pro Thr Ile Asp Thr Phe Glu Ala Ala Leu Arg Ser Leu
1 5 10 15

Pro Ala Ala Arg Asp Ala Leu Gly Ala Tyr Pro Leu Ser Ser Glu Gln
20 25 30

Lys Arg Leu Trp Leu Leu Ala Gln Leu Ala Gly Thr Ala Thr Leu Pro
35 40 45

Val Thr Val Arg Tyr Ala Phe Thr Gly Thr Val Asp Leu Ala Val Val
50 55 60

Gln Gln Asn Leu Ser Ala Trp Ile Ala His Ser Glu Ser Leu Arg Ser
65 70 75 80

Leu Phe Val Glu Val Leu Glu Arg Pro Val Arg Leu Leu Met Pro Thr
85 90 95

Gly Leu Val Lys Leu Glu Tyr Phe Asp Arg Pro Pro Ser Asp Ala Asp
100 105 110

Met Ala Glu Leu Ile Gly Ala Ala Phe Glu Leu Asp Lys Gly Pro Leu
115 120 125

Leu Arg Ala Phe Ile Thr Arg Thr Ala Ala Gln Gln His Glu Leu His
130 135 140

Leu Val Gly His Pro Ile Val Val Asp Glu Pro Ser Leu Gln Arg Ile
145 150 155 160

Ala Gln Thr Leu Phe Gln Thr Glu Pro Asp His Gln Tyr Pro Ala Val
165 170 175

Gly Ala Ile Ala Glu Val Phe Gln Arg Glu Gln Thr Leu Ala Gln Asp
180 185 190

Ala Gln Ile Thr Glu Gln Trp Gln Gln Trp Gly Ile Gly Leu Gln Ala
195 200 205

Pro Ala Ala Thr Glu Ile Pro Thr Glu Asn Pro Arg Pro Ala Ile Lys
210 215 220

Gly Ser Asp Arg Gln Val His Glu Ala Leu Thr Ala Trp Gly Asp Gln
225 230 235 240

Pro Val Ala Glu Ala Glu Ile Val Ser Ser Trp Leu Thr Val Leu Met
245 250 255

Arg Trp Gln Gly Ser Gln Ser Ala Leu Cys Ala Ile Lys Val Arg Asp
260 265 270

Lys Ala His Ala Asn Leu Ile Gly Pro Leu Gln Thr Tyr Leu Pro Val
275 280 285

Arg Val Asp Met Pro Asp Gly Ser Thr Leu Ala Gln Leu Arg Leu Gln
290 295 300

Val Glu Glu Gln Leu Asn Gly Asn Asp His Pro Ser Phe Ser Thr Leu
305 310 315 320

Leu Glu Val Cys Pro Pro Lys Arg Asp Leu Ser Arg Thr Pro Tyr Phe
325 330 335

Gln Thr Gly Leu Gln Phe Ile Ala His Asp Val Glu Gln Arg Asp Phe
340 345 350

His Ala Gly Asn Leu Thr Arg Leu Pro Thr Lys Gln Pro Ser Ser Asp
355 360 365

Leu Asp Leu Phe Ile Ser Cys Trp Val Ser Asp Gly Thr Leu Gly Leu
370 375 380

Thr Leu Asp Tyr Asp Cys Ala Val Leu Asn Ser Ser Gln Val Glu Val
385 390 395 400

Leu Ala Gln Ala Leu Ile Ser Val Leu Ser Ala Pro Gly Glu Gln Pro
405 410 415

Ile Ala Thr Val Ala Leu Met Gly Gln Gln Met Gln Gln Thr Val Leu
420 425 430

Ala Gln Ala His Gly Pro Arg Thr Thr Pro Pro Gln Leu Thr Leu Thr
435 440 445

Glu Trp Val Ala Ala Ser Thr Glu Lys Ser Pro Leu Ala Val Ala Val
450 455 460

Ile Asp His Gly Gln Gln Leu Ser Tyr Ala Glu Leu Trp Ala Arg Ala
465 470 475 480

Ala Leu Val Ala Ala Asn Ile Ser Gln His Val Ala Lys Pro Arg Ser
485 490 495

Ile Ile Ala Val Ala Leu Pro Arg Ser Ala Glu Phe Ile Ala Ala Leu
500 505 510

Leu Gly Val Val Arg Ala Gly His Ala Phe Leu Pro Ile Asp Pro Arg
515 520 525



Leu Pro Thr Asp Arg Ile Gln Phe Leu Ile Glu Asn Ser Gly Cys Glu
530 535 540

Leu Val Ile Thr Ser Asp Gln Gln Ser Val Glu Gly Trp Pro Gln Val
545 550 555 560

Ala Arg Ile Arg Met Glu Ala Leu Asp Pro Asp Ile Arg Trp Val Ala
565 570 575

Pro Thr Gly Leu Ser His Ser Asp Ala Ala Tyr Leu Ile Tyr Thr Ser
580 585 590

Gly Ser Thr Gly Val Pro Lys Gly Val Val Val Glu His Arg Gln Val
595 600 605

Val Asn Asn Ile Leu Trp Arg Gln Arg Thr Trp Pro Leu Thr Ala Gln
610 615 620

Asp Asn Val Leu His Asn His Ser Phe Ser Phe Asp Pro Ser Val Trp
625 630 635 640

Ala Leu Phe Trp Pro Leu Leu Thr Gly Gly Thr Ile Val Leu Ala Asp
645 650 655

Val Arg Thr Met Glu Asp Ser Thr Ala Leu Leu Asp Leu Met Ile Arg
660 665 670

His Asp Val Ser Val Leu Gly Gly Val Pro Ser Leu Leu Gly Thr Leu
675 680 685

Ile Asp His Pro Phe Ala Asn Asp Cys Arg Ala Val Lys Leu Val Leu
690 695 700

Ser Gly Gly Glu Val Leu Asn Pro Glu Leu Ala His Lys Ile Gln Lys
705 710 715 720

Val Trp Gln Ala Asp Val Ala Asn Leu Tyr Gly Pro Thr Glu Ala Thr
725 730 735

Ile Asp Ala Leu Tyr Phe Ser Ile Asp Lys Asn Ala Ala Gly Ala Ile
740 745 750

Pro Ile Gly Tyr Pro Ile Asp Asn Thr Asp Ala Tyr Ile Val Asp Leu
755 760 765

Asn Leu Asn Pro Val Pro Pro Gly Val Pro Gly Glu Ile Met Leu Ala
770 775 780

Gly Gln Asn Leu Ala Arg Gly Tyr Leu Gly Lys Pro Ala Gln Thr Ala
785 790 795 800

Gln Arg Phe Leu Pro Asn Pro Phe Gly Asn Gly Arg Val Tyr Ala Thr
805 810 815

Gly Asp Leu Gly Arg Arg Trp Ser Ser Gly Ala Ile Ser Tyr Leu Gly
820 825 830

Arg Arg Asp Gln Gln Val Lys Ile Arg Gly His Arg Ile Glu Leu Asn
835 840 845

Glu Val Ala His Leu Leu Cys Gln Ala Leu Glu Leu Lys Glu Ala Ile
850 855 860

Val Phe Ala Gln His Ala Gly Thr Glu Gln Ala Arg Leu Val Ala Ala
865 870 875 880

Ile Glu Gln Gln Pro Gly Leu His Ser Glu Gly Ile Lys Gln Glu Leu
885 890 895

Leu Arg His Leu Pro Ala Tyr Leu Ile Pro Ser Gln Leu Leu Leu Leu
900 905 910

Asp Glu Leu Pro Arg Thr Ala Thr Gly Lys Val Asp Met Leu Lys Leu
915 920 925

Asp Gln Leu Ala Ala Pro Gln Leu Asn Asp Ala Gly Gly Thr Glu Cys
930 935 940

Arg Ala Pro Arg Thr Asp Leu Glu Gln Ser Val Met Thr Asp Phe Ala
945 950 955 960

Gln Val Leu Gly Leu Thr Ala Val Thr Pro Asp Thr Asp Phe Phe Glu
965 970 975

Gln Gly Gly Asn Ser Ile Leu Leu Thr Arg Leu Ala Gly Thr Leu Ser
980 985 990

Ala Lys Tyr Gln Val Gln Ile Pro Leu His Glu Phe Phe Leu Thr Pro
995 1000 1005

Thr Pro Ala Ala Val Ala Gln Ala Ile Glu Ile Tyr Arg Arg Glu
1010 1015 1020

Gly Leu Thr Ala Leu Leu Ser Arg Gln His Ala Gln Thr Leu Glu
1025 1030 1035

Gln Asp Ile Tyr Leu Glu Glu His Ile Arg Pro Asp Gly Leu Pro
1040 1045 1050

His Ala Asn Trp Tyr Gln Pro Ser Val Val Phe Leu Thr Gly Ala
1055 1060 1065

Thr Gly Tyr Leu Gly Leu Tyr Leu Ile Glu Gln Leu Leu Lys Arg
1070 1075 1080

Thr Thr Ser Arg Val Ile Cys Leu Cys Arg Ala Lys Asp Ala Glu
1085 1090 1095

His Ala Lys Ala Arg Ile Leu Glu Gly Leu Lys Thr Tyr Arg Ile
1100 1105 1110

Asp Val Gly Ser Glu Leu His Arg Val Glu Tyr Leu Thr Gly Asp
1115 1120 1125

Leu Ala Leu Pro His Leu Gly Leu Ser Glu His Gln Trp Gln Thr
1130 1135 1140

Leu Ala Glu Glu Val Asp Val Ile Tyr His Asn Gly Ala Leu Val
1145 1150 1155

Asn Phe Val Tyr Pro Tyr Ser Ala Leu Lys Ala Thr Asn Val Gly
1160 1165 1170

Gly Thr Gln Ala Ile Leu Glu Leu Ala Cys Thr Ala Arg Leu Lys
1175 1180 1185

Ser Val Gln Tyr Val Ser Thr Val Asp Thr Leu Leu Ala Thr His
1190 1195 1200

Val Pro Arg Pro Phe Ile Glu Asp Asp Ala Pro Leu Arg Ser Ala
1205 1210 1215

Val Gly Val Pro Val Gly Tyr Thr Gly Ser Lys Trp Val Ala Glu
1220 1225 1230

Gly Val Ala Asn Leu Gly Leu Arg Arg Gly Ile Pro Val Ser Ile
1235 1240 1245

Phe Arg Pro Gly Leu Ile Leu Gly His Thr Glu Thr Gly Ala Ser
1250 1255 1260

Gln Ser Ile Asp Tyr Leu Leu Val Ala Leu Arg Gly Phe Leu Pro
1265 1270 1275

Met Gly Ile Val Pro Asp Tyr Pro Arg Ile Phe Asp Ile Val Pro
1280 1285 1290

Val Asp Tyr Val Ala Ala Ala Ile Val His Ile Ser Met Gln Pro
1295 1300 1305

Gln Gly Arg Asp Lys Phe Phe His Leu Phe Asn Pro Ala Pro Val
1310 1315 1320

Thr Ile Arg Gln Phe Cys Asp Trp Ile Arg Glu Phe Gly Tyr Glu
1325 1330 1335

Phe Lys Leu Val Asp Phe Glu His Gly Arg Gln Gln Ala Leu Ser
1340 1345 1350

Val Pro Pro Gly His Leu Leu Tyr Pro Leu Val Pro Leu Ile Arg
1355 1360 1365

Asp Ala Asp Pro Leu Pro His Arg Ala Leu Asp Pro Asp Tyr Ile
1370 1375 1380

His Glu Val Asn Pro Ala Leu Glu Cys Lys Gln Thr Leu Glu Leu
1385 1390 1395

Leu Ala Ser Ser Asp Ile Thr Leu Ser Lys Thr Thr Lys Ala Tyr
1400 1405 1410

Ala His Thr Ile Leu Arg Tyr Leu Ile Asp Thr Gly Phe Met Ala
1415 1420 1425

Lys Pro Gly Val
1430



<210> SEQ ID 5
<211> Length: 350
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 5

Met Glu Ser Ile Ala Phe Pro Ile Ala His Lys Pro Phe Ile Leu Gly
1 5 10 15

Cys Pro Glu Asn Leu Pro Ala Thr Glu Arg Ala Leu Ala Pro Ser Ala 
20 25 30 

Ala Met Ala Arg Gln Val Leu Glu Tyr Leu Glu Ala Cys Pro Gln Ala
35 40 45 

Lys Asn Leu Glu Gln Tyr Leu Gly Thr Leu Arg Glu Val Leu Ala His
50 55 60

Leu Pro Cys Ala Ser Thr Gly Leu Met Thr Asp Asp Pro Arg Glu Asn
65 70 75 80

Gln Glu Asn Arg Asp Asn Asp Phe Ala Phe Gly Ile Glu Arg His Gln
85 90 95

Gly Asp Thr Val Thr Leu Met Val Lys Ala Thr Leu Asp Ala Ala Ile
100 105 110

Gln Thr Gly Glu Leu Val Gln Arg Ser Gly Thr Ser Leu Asp His Ser
115 120 125

Glu Trp Ser Asp Met Met Ser Val Ala Gln Val Ile Leu Gln Thr Ile
130 135 140

Ala Asp Pro Arg Val Met Pro Glu Ser Arg Leu Thr Phe Gln Ala Pro
145 150 155 160

Lys Ser Lys Val Glu Glu Asp Asp Gln Asp Pro Leu Arg Arg Trp Val
165 170 175

Arg Gly His Leu Leu Phe Met Val Leu Cys Gln Gly Met Ser Leu Cys
180 185 190

Thr Asn Leu Leu Ile Ser Ala Ala His Asp Lys Asp Leu Glu Leu Ala
195 200 205

Cys Ala Gln Ala Asn Arg Leu Ile Gln Leu Met Asn Ile Ser Arg Ile
210 215 220

Thr Leu Glu Phe Ala Thr Asp Leu Asn Ser Gln Gln Tyr Val Ser Gln
225 230 235 240

Ile Arg Pro Thr Leu Met Pro Ala Ile Ala Pro Pro Lys Met Ser Gly
245 250 255

Ile Asn Trp Arg Asp His Val Val Met Ile Arg Trp Met Arg Gln Ser
260 265 270

Thr Asp Ala Trp Asn Phe Ile Glu Gln Ala Tyr Pro Gln Leu Ala Glu
275 280 285

Arg Met Arg Thr Thr Leu Ala Gln Val Tyr Ser Ala His Arg Gly Val
290 295 300

Cys Glu Lys Phe Val Gly Glu Glu Asn Thr Ser Leu Leu Ala Lys Glu
305 310 315 320

Asn Ala Thr Asn Thr Ala Gly Gln Val Leu Glu Asn Leu Lys Lys Ser
325 330 335

Arg Leu Lys Tyr Leu Lys Thr Lys Gly Cys Ala Gly Ala Gly 
340 345 350



<210> SEQ ID 6
<211> Length: 61
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 6

Met Pro Thr Phe Leu Gly Asp Asp Asp Ala Val Pro Cys Val Val Val
1 5 10 15

Val Asn Ala Asp Lys His Tyr Ser Ile Trp Pro Ser Ala Arg Asp Ile
20 25 30

Pro Ser Gly Trp Ser Glu Glu Gly Phe Lys Gly Ser Arg Ser Asp Cys
35 40 45

Leu Glu His Ile Ala Gln Ile Trp Pro Glu Pro Thr Ala
50 55 60



<210> SEQ ID 7
<211> Length: 355
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 7

Met Thr Ser Thr His Arg Thr Thr Asp Gln Val Lys Pro Ala Val Leu
1 5 10 15

Asp Met Pro Gly Leu Ser Gly Ile Leu Phe Gly His Ala Ala Phe Gln
20 25 30

Tyr Leu Arg Ala Ser Cys Glu Leu Asp Leu Phe Glu His Val Arg Asp
35 40 45

Leu Arg Glu Ala Thr Lys Glu Ser Ile Ser Ser Arg Leu Lys Leu Gln
50 55 60

Glu Arg Ala Ala Asp Ile Leu Leu Leu Gly Ala Thr Ser Leu Gly Met
65 70 75 80

Leu Val Lys Glu Asn Gly Ile Tyr Arg Asn Ala Asp Val Val Glu Asp
85 90 95

Leu Met Ala Thr Asp Asp Trp Gln Arg Phe Lys Asp Thr Val Ala Phe
100 105 110

Glu Asn Tyr Ile Val Tyr Glu Gly Gln Leu Asp Phe Thr Glu Ser Leu
115 120 125

Gln Lys Asn Thr Asn Val Gly Leu Gln Arg Phe Pro Gly Glu Gly Arg
130 135 140

Asp Leu Tyr His Arg Leu His Gln Asn Pro Lys Leu Glu Asn Val Phe
145 150 155 160

Tyr Arg Tyr Met Arg Ser Trp Ser Glu Leu Ala Asn Gln Asp Leu Val
165 170 175

Lys His Leu Asp Leu Ser Arg Val Lys Lys Leu Leu Asp Ala Gly Gly
180 185 190

Gly Asp Ala Val Asn Ala Ile Ala Leu Ala Lys His Asn Glu Gln Leu
195 200 205

Asn Val Thr Val Leu Asp Ile Asp Asn Ser Ile Pro Val Thr Gln Gly
210 215 220

Lys Ile Asn Asp Ser Gly Leu Ser His Arg Val Lys Ala Gln Ala Leu
225 230 235 240

Asp Ile Leu His Gln Ser Phe Pro Glu Gly Tyr Asp Cys Ile Leu Phe
245 250 255

Ala His Gln Leu Val Ile Trp Thr Leu Glu Glu Asn Thr His Met Leu
260 265 270

Arg Lys Ala Tyr Asp Ala Leu Pro Glu Gly Gly Arg Val Val Ile Phe
275 280 285

Asn Ser Met Ser Asn Asp Glu Gly Asp Gly Pro Val Met Ala Ala Leu
290 295 300

Asp Ser Val Tyr Phe Ala Cys Leu Pro Ala Glu Gly Gly Met Ile Tyr
305 310 315 320

Ser Trp Lys Gln Tyr Glu Val Cys Leu Ala Glu Ala Gly Phe Lys Asn
325 330 335

Pro Val Arg Thr Ala Ile Pro Gly Trp Thr Pro His Gly Ile Ile Val
340 345 350

Ala Tyr Lys
355



<210> SEQ ID 8
<211> Length: 347
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 8

Met Ala Arg Ser Pro Glu Thr Asn Ser Ala Met Pro Gln Gln Ile Arg
1 5 10 15

Gln Leu Leu Tyr Ser Gln Leu Ile Ser Gln Ser Ile Gln Thr Phe Cys
20 25 30

Glu Leu Arg Leu Pro Asp Val Leu Gln Ala Ala Gly Gln Pro Thr Ser
35 40 45

Ile Glu Arg Leu Ala Glu Gln Thr His Thr His Ile Ser Ala Leu Ser
50 55 60

Arg Leu Leu Lys Ala Leu Lys Pro Phe Gly Leu Val Lys Glu Thr Asp
65 70 75 80

Glu Gly Phe Ser Leu Thr Asp Leu Gly Ala Ser Leu Thr His Asp Ala
85 90 95

Phe Ala Ser Ala Gln Pro Ser Ala Leu Leu Ile Asn Gly Glu Met Gly
100 105 110

Gln Ala Trp Arg Gly Met Ala Gln Thr Ile Arg Thr Gly Glu Ser Ser
115 120 125

Phe Lys Met Tyr Tyr Gly Ile Ser Leu Phe Glu Tyr Phe Glu Gln His
130 135 140

Pro Glu Arg Arg Ala Ile Phe Asp Arg Ser Gln Asp Met Gly Leu Asp
145 150 155 160

Leu Glu Ile Pro Glu Ile Leu Glu Asn Ile Asn Leu Asn Asp Gly Glu
165 170 175

Asn Ile Val Asp Val Gly Gly Gly Ser Gly His Leu Leu Met His Met
180 185 190

Leu Asp Lys Trp Pro Glu Ser Thr Gly Ile Leu Phe Asp Leu Pro Val
195 200 205

Ala Ala Lys Ile Ala Gln Gln His Leu His Lys Ser Gly Lys Ala Gly
210 215 220

Cys Phe Glu Ile Val Ala Gly Asp Phe Phe Lys Ser Leu Pro Asp Ser
225 230 235 240

Gly Ser Val Tyr Leu Leu Ser His Val Leu His Asp Trp Gly Asp Glu
245 250 255

Asp Cys Lys Ala Ile Leu Ala Thr Cys Arg Arg Ser Met Pro Asp Asn
260 265 270

Ala Leu Leu Val Val Val Asp Leu Val Ile Asp Gln Ser Glu Ser Ala
275 280 285

Gln Pro Asn Pro Thr Gly Ala Met Met Asp Leu Tyr Met Leu Ser Leu
290 295 300

Phe Gly Ile Ala Gly Gly Lys Glu Arg Asn Glu Asp Glu Phe Arg Thr
305 310 315 320

Leu Ile Glu Asn Ser Gly Phe Asn Val Lys Gln Val Lys Arg Leu Pro
325 330 335

Ser Gly Asn Gly Ile Ile Phe Ala Tyr Pro Lys
340 345



<210> SEQ ID 9
<211> Length: 180
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 9

Met Ser Thr Leu Val Tyr Tyr Val Ala Ala Thr Leu Asp Gly Tyr Ile
1 5 10 15

Ala Thr Gln Gln His Lys Leu Asp Trp Leu Glu Asn Phe Ala Leu Gly
20 25 30

Asp Asp Ala Thr Ala Tyr Asp Asp Phe Tyr Gln Thr Ile Gly Ala Val
35 40 45

Val Met Gly Ser Gln Thr Tyr Glu Trp Ile Met Ser Asn Ala Pro Asp
50 55 60

Asp Trp Pro Tyr Gln Asp Val Pro Ala Phe Val Met Ser Asn Arg Asp
65 70 75 80

Leu Ser Ala Pro Ala Asn Leu Asp Ile Thr Phe Leu Arg Gly Asp Ala
85 90 95

Ser Ala Ile Ala Val Arg Ala Arg Gln Ala Ala Lys Gly Lys Asn Val
100 105 110

Trp Leu Val Gly Gly Gly Lys Thr Ala Ala Cys Phe Ala Asn Ala Gly
115 120 125

Glu Leu Gln Gln Leu Phe Ile Thr Thr Ile Pro Thr Phe Ile Gly Thr
130 135 140

Gly Val Pro Val Leu Pro Val Asp Arg Ala Leu Glu Val Val Leu Arg
145 150 155 160

Glu Gln Arg Thr Leu Gln Ser Gly Ala Met Glu Cys Ile Leu Asp Val
165 170 175

Lys Lys Ala Asp 
180 



<210> SEQ ID 10 
<211> Length: 220 
<212> Type: PRT 

<213> Organism: Pseudomonas fluorescens A2-2 
<400> SEQUENCE: 10 

Met Ser Asn Val Phe Ser Gly Gly Lys Gly Asn Gly Asn Pro Gly Phe
1 5 10 15

Val Arg Thr Phe Ser Arg Ile Ala Pro Thr Tyr Glu Glu Lys Tyr Gly
20 25 30

Thr Lys Leu Ser Gln Ala His Asp Asp Cys Leu Arg Met Leu Ser Arg
35 40 45

Trp Met Cys Thr Ser Arg Pro Glu Arg Val Leu Asp Ile Gly Cys Gly
50 55 60

Thr Gly Ala Leu Ile Glu Arg Met Phe Ala Leu Trp Pro Glu Ala Arg
65 70 75 80

Phe Glu Gly Val Asp Pro Ala Gln Gly Met Val Asp Glu Ala Ala Lys
85 90 95

Arg Arg Pro Phe Ala Ser Phe Val Lys Gly Val Ala Glu Ala Leu Pro
100 105 110

Phe Pro Ser Gln Ser Met Asp Leu Val Val Cys Ser Met Ser Phe Gly
115 120 125

His Trp Ala Asp Lys Ser Val Ser Leu Asn Glu Val Arg Arg Val Leu
130 135 140

Lys Pro Gln Gly Leu Phe Cys Leu Val Glu Asn Leu Pro Ala Gly Trp
145 150 155 160

Gly Leu Thr Thr Leu Ile Asn Trp Leu Leu Gly Ser Leu Ala Asp Tyr
165 170 175

Arg Ser Glu His Glu Val Ile Gln Leu Ala Gln Thr Ala Gly Leu Gln
180 185 190

Ser Met Glu Thr Ser Val Thr Asp Gln His Val Ile Val Ala Thr Phe
195 200 205

Arg Pro Cys Cys Gly Glu Val Gly Asp His Gly Arg 
210 215 220



<210> SEQ ID 11
<211> Length: 509
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 11

Met Val Val Lys Asn Lys Gln Val Leu Val Val Gly Ala Gly Pro Val
1 5 10 15

Gly Leu Ala Val Ala Ala Ala Leu Ala Glu Leu Gly Ile Ala Val Asp
20 25 30

Leu Ile Asp Lys Arg Pro Ala Ala Ser Pro His Ser Arg Ala Phe Gly
35 40 45

Leu Glu Pro Val Thr Leu Glu Leu Leu Asn Ala Trp Gly Val Ala Asp
50 55 60

Glu Met Ile Arg Arg Gly Ile Val Trp Ala Ser Ala Pro Leu Gly Asp
65 70 75 80

Lys Ala Gly Arg Thr Leu Ser Phe Ser Lys Leu Pro Cys Glu Tyr Pro
85 90 95

His Met Val Ile Ile Pro Gln Ser Gln Thr Glu Ser Val Leu Thr Asp
100 105 110

Trp Val Asn Arg Lys Gly Val Asn Leu Lys Arg Gly Tyr Ala Leu Lys
115 120 125

Ala Leu Asp Ala Gly Asp Leu His Val Glu Val Thr Leu Glu His Ser
130 135 140

Glu Thr Gly Ser Val Gln Gln Ser Arg Tyr Asp Trp Val Leu Gly Ala
145 150 155 160

Asp Gly Val Asn Ser Ser Val Arg Gln Leu Leu Asn Ile Ser Phe Val
165 170 175

Gly Gln Asp Tyr Lys His Ser Leu Val Val Ala Asp Val Val Leu Arg
180 185 190

Asn Pro Pro Ser Pro Ala Val His Ala Arg Ser Val Ser Arg Gly Leu
195 200 205

Val Ala Leu Phe Pro Leu Pro Asp Gly Ser Tyr Arg Val Ser Ile Glu
210 215 220

Asp Asn Glu Arg Met Asp Thr Pro Val Lys Gln Pro Val Thr His Glu
225 230 235 240

Glu Ile Ala Gly Gly Met Lys Asp Ile Leu Gly Thr Asp Phe Gly Leu
245 250 255

Ala Gln Val Leu Trp Ser Ala Arg Tyr Arg Ser Gln Gln Arg Leu Ala
260 265 270

Thr His Tyr Arg Gln Gly Arg Val Phe Leu Leu Gly Asp Ala Ala His
275 280 285

Thr His Val Pro Ala Gly Gly Gln Gly Leu Gln Met Gly Ile Gly Asp
290 295 300

Ala Ala Asn Leu Ala Trp Lys Leu Ala Gly Val Ile Gln Ala Thr Leu
305 310 315 320

Pro Met Asp Leu Leu Glu Ser Tyr Glu Ala Glu Arg Arg Pro Ile Ala
325 330 335

Ala Ala Ala Leu Arg Asn Thr Asp Leu Leu Phe Arg Phe Asn Thr Ala
340 345 350

Ser Gly Pro Ile Gly Arg Leu Ile His Trp Ile Gly Leu Gln Ala Thr
355 360 365

Arg Ala Pro Tyr Val Ala Gln Lys Val Val Ser Ala Leu Ala Gly Glu
370 375 380



Gly Val Arg Tyr Asp Ser Val Arg Arg Arg Gly Asp His Arg Leu Val 
385 390 395 400 

Gly Arg Arg Leu Pro Leu Leu Ser Leu Leu Pro Glu Gly Glu Arg Leu 
405 410 415 

Pro Arg Gln Ser Leu Thr Gln Leu Leu Arg Ala Gly Arg Phe Val Leu
420 425 430

Val His His Arg Ala Lys Ala Leu Ala Ala Asp Leu Arg Arg Asp Phe
435 440 445

Pro Gly Leu Gln Thr Ala Ser Ile Cys Glu Asp Ser His Asn Asn Ser
450 455 460

Leu Ser Ala Gly Glu Gly Val Ile Val Arg Pro Asp Gly Val Val Ile
465 470 475 480

Trp Val Gly Lys Lys Ser Thr Leu Ala Lys Glu Arg Leu Gly Glu Trp
485 490 495

Leu Leu Asp Asp Ser Lys Ser Ala Arg Gln Ser Leu Thr
500 505



<210> SEQ ID 12
<211> Length: 348
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 12

Met Ala His Tyr Asp Ser Val Gly Thr Ala Pro Gly Ala Ser Asp Asp
1 5 10 15

Gly Met Ala Val Ala Ser Ile Leu Gln Leu Met Arg Glu Thr Ile Thr
20 25 30

Arg Ser Asp Ala Lys Asn Asn Val Val Phe Leu Leu Ala Asp Gly Glu
35 40 45

Glu Leu Gly Leu Leu Gly Ala Glu His Tyr Val Ser Gln Leu Ser Thr
50 55 60

Pro Glu Arg Glu Ala Ile Arg Leu Val Leu Asn Phe Glu Ala Arg Gly
65 70 75 80

Asn Gln Gly Ile Pro Leu Leu Phe Glu Thr Ser Gln Lys Asp Tyr Ala
85 90 95

Leu Ile Arg Thr Val Asn Ala Gly Val Arg Asp Ile Ile Ser Phe Ser
100 105 110

Phe Thr Pro Leu Ile Tyr Asn Met Leu Gln Asn Asp Thr Asp Phe Thr
115 120 125

Val Phe Arg Lys Lys Asn Ile Ala Gly Leu Asn Phe Ala Val Val Glu
130 135 140

Gly Phe Gln His Tyr His His Met Ser Asp Thr Val Glu Asn Leu Gly
145 150 155 160

Pro Glu Thr Leu Phe Arg Tyr Gln Lys Thr Val Arg Glu Val Gly Asn
165 170 175

His Phe Ile Gln Gly Ile Asp Leu Ser Ser Leu Ser Ala Asp Glu Asp
180 185 190

Ala Thr Tyr Phe Pro Leu Pro Gly Gly Thr Leu Leu Val Leu Asn Leu
195 200 205

Pro Thr Leu Tyr Ala Leu Gly Met Gly Ser Phe Val Leu Cys Gly Leu
210 215 220

Trp Ala Gln Arg Cys Arg Thr Arg Arg Gln His Gln Gly Lys Asn Cys
225 230 235 240

Val Leu Arg Pro Met Ala Ile Ala Leu Leu Gly Ile Ala Cys Ala Ala
245 250 255

Leu Val Phe Tyr Val Pro Ser Ile Ala Tyr Leu Phe Val Ile Pro Ser
260 265 270

Leu Leu Leu Ala Cys Ala Met Leu Ser Arg Ser Leu Phe Ile Ser Tyr
275 280 285

Ser Ile Met Leu Leu Gly Ala Tyr Ala Cys Gly Ile Leu Tyr Ala Pro
290 295 300

Ile Val Tyr Leu Ile Ser Ser Gly Leu Lys Met Pro Phe Ile Ala Gly
305 310 315 320

Val Ile Ala Leu Leu Pro Leu Cys Leu Leu Ala Val Gly Leu Ala Gly
325 330 335

Val Ile Ala Arg Ser Arg Asp Cys Arg Thr Cys Asp
340 345



<210> SEQ ID 13
<211> Length: 572
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 13

Met Arg Ser Leu Lys Ile Ile Val Leu Ala Ser Ala Phe Asn Gly Leu
1 5 10 15

Thr Gln Arg Ala Trp Leu Asp Leu Arg Gln Ser Gly His Ala Pro Ser
20 25 30

Val Val Leu Phe Thr Asp Pro Ala Leu Val Cys Gln Gln Ile Glu Asp
35 40 45

Ser Asp Ala Asp Leu Val Ile Cys Pro Phe Leu Lys Asp Arg Val Pro
50 55 60

Gln Gln Leu Trp Ser Asn Leu Glu Arg Pro Val Val Ile Ile His Pro
65 70 75 80

Gly Ile Val Gly Asp Arg Gly Ala Ser Ala Leu Asp Trp Ala Ile Ser
85 90 95

Gln Gln Val Gly Arg Trp Gly Val Thr Ala Leu Gln Ala Val Glu Glu
100 105 110

Met Asp Ala Gly Pro Ile Trp Ser Thr Cys Glu Phe Asp Met Pro Ala
115 120 125

Asp Val Arg Lys Ser Glu Leu Tyr Asn Gly Ala Val Ser Asp Ala Ala
130 135 140

Leu Tyr Cys Ile Arg Asp Val Val Glu Lys Phe Ala Arg Val Phe Val
145 150 155 160

Pro Val Pro Leu Asp Tyr Thr Gln Ala His Val Ile Gly Arg Leu Gln
165 170 175

Pro Asn Met Thr Gln Ala Asp Arg Thr Phe Ser Trp Tyr Asp Cys Ala
180 185 190

Arg Phe Ile Lys Arg Cys Ile Asp Ala Ala Asp Gly Gln Pro Gly Val
195 200 205

Leu Ala Ser Ile Gln Gly Gly Gln Tyr Tyr Leu Tyr Asp Ala His Leu
210 215 220

Asp Ala Arg His Gly Thr Pro Gly Glu Ile Leu Ala Val Gln Asp Asp
225 230 235 240

Ala Val Leu Val Ala Ala Gly Asp Gln Ser Leu Trp Ile Gly Ser Leu
245 250 255

Lys Arg Lys Ala Arg Pro Gly Glu Glu Thr Phe Lys Leu Pro Ala Arg
260 265 270

His Val Leu Ala Glu Ala Leu Ala Asp Ile Pro Val Leu Asp Ser Ser
275 280 285

Ile Ala Asn Gln Met Phe Asp Glu Gln Ala Tyr Gln Pro Ile Arg Tyr
290 295 300

Arg Glu Ala Gly His Val Gly Glu Leu Thr Phe Glu Phe Tyr Asn Gly
305 310 315 320

Ala Met Ser Thr Glu Gln Cys Gln Arg Leu Val Ala Ala Leu Arg Trp
325 330 335

Ala Lys Thr Arg Asp Thr Gln Val Leu Val Ile Lys Gly Gly Arg Gly
340 345 350

Ser Phe Ser Asn Gly Val His Leu Asn Val Ile Gln Ala Ala Pro Val
355 360 365

Pro Gly Leu Glu Ala Trp Ala Asn Ile Gln Ala Ile Tyr Asp Val Cys
370 375 380

His Glu Leu Leu Thr Ala Arg Gln Leu Val Ile Ser Gly Leu Thr Gly
385 390 395 400

Ser Ala Gly Ala Gly Gly Val Met Leu Ala Leu Ala Ala Asp Ile Val
405 410 415

Leu Ala Arg Glu Ser Val Val Leu Asn Pro His Tyr Lys Thr Met Gly
420 425 430

Leu Tyr Gly Ser Glu Tyr Trp Thr Tyr Ser Leu Pro Arg Ala Val Gly
435 440 445

Ser Glu Val Ala His Gln Leu Thr Asp Ala Cys Leu Pro Ile Ser Ala
450 455 460

Leu Gln Ala Glu Gln Tyr Gly Leu Val Gln Gly Ile Gly Pro Arg Cys
465 470 475 480

Pro His Ala Phe Ser Arg Trp Leu Met Gln Gln Ala Ser Ser Ala Leu
485 490 495

Thr Asp Glu Lys Tyr Ala Val Ala Arg Ala Arg Lys Ala Ala Leu Asp
500 505 510

Ile Asp Gln Ile Thr Arg Cys Arg Glu Ala Glu Leu Ala Gln Met Gln
515 520 525

Leu Asp Met Val His Asn Arg His Gln Phe Ala Glu Lys Cys Arg Asn
530 535 540

Phe Val Leu Lys Arg Lys Thr Cys Gln Thr Pro Gln Arg Leu Met Ala
545 550 555 560

Pro Trp Ala Val Ala Arg Glu Ala Ala Leu Val Gly 
565 570



<210> SEQ ID 14
<211> Length: 230
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 14

Met Ile Gly Ile Val Ile Pro Ala His Asn Glu Glu Arg His Ile Ser
1 5 10 15

Ala Cys Leu Ala Ser Ile Gln Arg Ala Ile Ala His Pro Ala Leu Ala
20 25 30

His Gln Gln Val Gln Leu Leu Val Val Leu Asp Ala Cys Ser Asp Glu
35 40 45

Thr Ala Thr Arg Val Ser Ala Met Gly Val Ala Thr Leu Glu Val Ser
50 55 60

Val Arg Asn Val Gly Lys Ala Arg Ala Leu Gly Ala Glu Arg Leu Leu
65 70 75 80

Glu Val Gly Ala Gln Trp Leu Ala Phe Thr Asp Ala Asp Thr Val Val
85 90 95

Pro Ala Asp Trp Leu Val Arg Gln Ile Gly Phe Gly Ala Asp Ala Val
100 105 110

Cys Gly Thr Val Glu Val Asp Ser Trp Ser Glu Tyr Gly Glu Ser Val
115 120 125

Arg Ser Arg Tyr Leu Glu Leu Tyr Gln Phe Thr Glu Asn His Arg His
130 135 140

Ile His Gly Ala Asn Leu Gly Leu Ser Ala Asp Ala Tyr Arg Asn Ala
145 150 155 160

Gly Gly Phe Gln His Leu Val Ala His Glu Asp Val Gln Leu Val Ala
165 170 175

Asp Leu Glu Arg Ile Gly Ala Arg Ile Val Trp Thr Ala Thr Asn Pro
180 185 190

Val Val Thr Ser Ala Arg Arg Asp Tyr Lys Cys Arg Gly Gly Phe Gly
195 200 205

Glu Tyr Leu Ala Ser Leu Val Ala Glu Gly Thr Arg Glu His Ser Pro
210 215 220

Ala His Ala Pro Ile Gly
225 230



<210> SEQ ID 15
<211> Length: 348
<212> Type: PRT

<213> Organism: Pseudomonas fluorescens A2-2
<400> SEQUENCE: 15

Met His Pro His Lys Thr Ala Ile Val Leu Ile Glu Tyr Gln Asn Asp
1 5 10 15

Phe Thr Thr Pro Gly Gly Val Phe His Asp Ala Val Lys Asp Val Met
20 25 30

Gln Thr Ser Asn Met Leu Ala Asn Thr Ala Thr Thr Ile Glu Gln Ala
35 40 45

Arg Lys Leu Gly Val Lys Ile Ile His Leu Pro Ile Arg Phe Ala Asp
50 55 60

Gly Tyr Pro Glu Leu Thr Leu Arg Ser Tyr Gly Ile Leu Lys Gly Val
65 70 75 80

Ala Asp Gly Ser Ala Phe Arg Ala Gly Ser Trp Gly Ala Glu Ile Thr
85 90 95

Asp Ala Leu Lys Arg Asp Pro Thr Asp Ile Val Ile Glu Gly Lys Arg
100 105 110

Gly Leu Asp Ala Phe Ala Thr Thr Gly Leu Asp Leu Val Leu Arg Asn
115 120 125

Asn Gly Ile Gln Asn Leu Val Val Ala Gly Phe Leu Thr Asn Cys Cys
130 135 140

Val Glu Gly Thr Val Arg Ser Gly Tyr Glu Lys Gly Tyr Asp Val Val
145 150 155 160

Thr Leu Thr Asp Cys Thr Ala Thr Phe Ser Asp Glu Gln Gln Arg Ala
165 170 175

Ala Glu Gln Phe Thr Leu Pro Met Phe Phe Ala Asn Pro Ala Thr His
180 185 190

Arg Val Ser Ala Ser Thr Glu Arg Arg Ile Lys Lys Ala Ala Thr Pro
195 200 205



Ala Glu Ser Pro Leu Phe Cys Leu Gly His Ser Val Gly Ala Tyr Cys
210 215 220

Ile Ser Pro Phe Pro Asn Asp Gln Ser Ser Arg Phe Thr Ser Thr Arg
225 230 235 240

Leu Ile His Thr Ser Ser Leu Arg Ser Pro Val Leu Ala Trp Met Pro
245 250 255

Ser Ala Met Asn Leu Lys Ala Phe Phe Thr Ser Met Leu Arg Pro Ala
260 265 270

Phe His Val Thr Trp Ile Asn Thr Ile Leu Gly Val Val Thr Pro Arg
275 280 285

Tyr Pro Ala Ala Gly Thr Ser Ser Ser Leu Ala Trp Arg Leu Met Ile
290 295 300

Trp Asn Leu Ser Cys Ser Gly Thr Leu Ala Thr Leu Val Ile Ala Ala
305 310 315 320

Tyr Thr Thr Ser Pro Met Ala Val Ala Val Ser Val Glu Val Ser Ala
325 330 335



Ala Arg Ser Ile Arg Thr Lys Gly Met Asp Lys Ser
340 345



1. A gene cluster having open reading frames which encode polypeptides
sufficient to direct the synthesis of a safracin molecule.



2. A nucleic acid sequence comprising: 

a) a nucleic acid sequence encoding at least one non-ribosomal
peptide synthetase which catalyses at least one step of the biosynthesis of
safracins;

b) a nucleic acid sequence which is complementary to the sequence 
in a); or 

c) variants or portions of the sequences of a) or b). 



3. The nucleic acid sequence according to claim 2 which comprises SEQ ID 
NO: 1, variants or portions thereof. 



4. The nucleic acid sequence according to claim 2 which comprises at least
one of the sacA, sacB, sacC, sacD, sacE, sacF, sacG, sacH, sacI, sacJ, orf1,
orf2, orf3 or orf4 genes, including variants or portions thereof.



5. The nucleic acid sequence according to claim 2 wherein the nucleic acid
encodes a polypeptide which is at least 30% identical in amino acid
sequence to a polypeptide encoded by any of the safracin gene cluster open
reading frames sacA to sacJ and orf1 to orf4 (SEQ ID NO:1 and genes
encoded in SEQ ID NO:1) or encoded by a variant or portion thereof.



6. The nucleic acid sequence according to claim 2 which encodes any of
SacA, SacB, SacC, SacD, SacE, SacF, SacG, SacH, SacI, SacJ, Orf1, Orf2,
Orf3 or Orf4 proteins (SEQ ID NO:2-15), and variants, mutants or portions
thereof.



7. The nucleic acid sequence according to claim 2 which encodes a peptide
synthetase, an L-Tyr derivative hydroxylase, an L-Tyr derivative methylase, an
L-Tyr O-methylase, a methyl-transferase, a monooxygenase or a safracin
resistance protein.



8. The nucleic acid sequence according to any one of claims 3-6 wherein 
the portion is at least 50 nucleotides in length. 



9. The nucleic acid sequence according to claim 8 wherein the portion is in
the range of 100 to 5000 nucleotides in length.



10. The nucleic acid sequence according to claim 8 wherein the portion is
in the range of 100 to 2500 nucleotides in length.



11. A hybridization probe comprising a nucleic acid sequence according to 
any one of the preceding claims. 



12. The hybridization probe according to claim 11 which comprises a
sequence of at least 10 nucleotide residues.



13. The hybridization probe according to claim 11 which comprises a
sequence of between 25 and 60 nucleotide residues.



14. Use of a hybridization probe according to any one of claims 11-13 in 
the detection of a safracin or ecteinascidin gene. 

15. The use according to claim 14 wherein the gene detection is conducted
in Ecteinascidia turbinata.



16. A polypeptide encoded by a nucleic acid sequence of any one of claims
2-10.



17. The polypeptide according to claim 16 which comprises an amino acid 
sequence selected from the group consisting of SEQ ID NO:2-15. 

18. A vector comprising a nucleic acid sequence of any one of claims 2-10. 

19. The vector according to claim 18 which is an expression vector. 



20. The vector according to claim 18 which is a cosmid.



21. A host cell transformed with one or more of the nucleic acid sequences
of any one of claims 2-10.



22. A host cell comprising a vector of any one of claims 18-20. 



23. The host cell according to claim 22 wherein the host cell is
transformed with an exogenous nucleic acid comprising a gene cluster
encoding polypeptides sufficient to direct the synthesis of a safracin.



24. The host cell according to claims 22 or 23 which is a microorganism.



25. The host cell according to claim 24 which is a bacterium. 



26. A recombinant bacterial host cell in which at least a portion of a 
nucleic acid sequence of any one of claims 2-10 is disrupted to result in a 
recombinant host cell that produces altered levels of safracin compound or 
safracin analogue, relative to a corresponding nonrecombinant bacterial 
host cell. 



27. The recombinant cell of claim 26, wherein the disrupted nucleic acid
sequence is endogenous.



28. A method of producing a safracin compound or safracin analogue
comprising fermenting an organism in which the copy number of the gene
cluster of claim 1 has been increased.



29. A method of producing a safracin compound or safracin analogue
comprising fermenting an organism in which expression of genes encoding
polypeptides sufficient to direct the synthesis of a safracin or safracin
analogue has been modulated by manipulation or replacement of one or
more genes or sequences responsible for regulating such expression.



30. A method of producing a safracin compound or safracin analogue
comprising contacting a compound that is a substrate for a polypeptide
encoded by one or more of the open reading frames of the safracin
biosynthesis gene cluster of claim 1 with said polypeptide, wherein the
polypeptide chemically modifies the compound.



31. The method according to claims 28 or 29 wherein the organism is 
Pseudomonas sp. 



32. A composition comprising at least one nucleic acid sequence of any 
one of claims 2-10. 



33. Use of a composition according to claim 32 for the combinatorial
biosynthesis of one or more of non-ribosomal peptide synthetases,
diketopiperazine rings and safracins.



34. Use of P2, P14, analogs and derivatives thereof in combinatorial
biosynthesis of one or more of non-ribosomal peptide synthetases,
diketopiperazine rings and safracins.



35. A safracin compound obtainable by a method according to any of 
claims 28-31. 



36. A safracin compound according to claim 35 wherein the compound 
has one of the following formulas 

[Structural formulas not reproduced. Substituent labels recoverable from the original figure: OMe, NH2, Me, OEt.] 

37. Use of a compound according to claims 35 or 36 as an antitumor 
agent. 



38. Use of a compound according to claims 35 or 36 in the manufacture of 
a medicament for the treatment of cancer. 



39. Use of a compound according to claims 35 or 36 as an antimicrobial 
agent. 



40. Use of a compound according to claims 35 or 36 in the manufacture of 
a medicament for the treatment of microbial infections. 



41. A pharmaceutical composition comprising a compound according to 
claims 35 or 36 and a pharmaceutically acceptable diluent, carrier or 
excipient. 



42. Use of a compound according to claims 35 or 36 in the synthesis of 
ecteinascidin compounds. 

Facsimile No.: 01223 365560 



Sheet No. 2 

Continuation of Box No. III  FURTHER APPLICANT(S) AND/OR (FURTHER) INVENTOR(S) 

Name and address: de la Calle, Fernando 
c/o Pharma Mar, S.A. 
Calle de la Calera 3 
Poligono Industrial de Tres Cantos 
Tres Cantos, Madrid, 28760, Spain 
This person is: [X] applicant and inventor 
State (that is, country) of nationality: ES 
State (that is, country) of residence: ES 
This person is applicant for the purposes of: [X] the United States of America only 

Name and address: Aparicio Perez, Tomas 
Poligono Industrial La Mina 
Avda. de los Reyes, 1 
Colmenar Viejo, 
Madrid, 28770, Spain 
This person is: [X] applicant and inventor 
State (that is, country) of nationality: ES 
State (that is, country) of residence: ES 
This person is applicant for the purposes of: [X] the United States of America only 

Name and address: Schleissner Sanchez, Carmen 
Poligono Industrial La Mina 
Avda. de los Reyes, 1 
Colmenar Viejo, 
Madrid, 28770, Spain 
This person is: [X] applicant and inventor 
State (that is, country) of nationality: ES 
State (that is, country) of residence: ES 
This person is applicant for the purposes of: [X] the United States of America only 

Name and address: Acebo Pais, Paloma 
Poligono Industrial La Mina 
Avda. de los Reyes, 1 
Colmenar Viejo, 
Madrid, 28770, Spain 
This person is: [X] applicant and inventor 
State (that is, country) of nationality: ES 
State (that is, country) of residence: ES 
This person is applicant for the purposes of: [X] the United States of America only 

[X] Further applicants and/or (further) inventors are indicated on another continuation sheet. 

Sheet No. 3 

Continuation of Box No. III  FURTHER APPLICANT(S) AND/OR (FURTHER) INVENTOR(S) 

Name and address: Rodriguez Ramos, Pilar 
Poligono Industrial La Mina 
Avda. de los Reyes, 1 
Colmenar Viejo, 
Madrid, 28770, Spain 
This person is: [X] applicant and inventor 
State (that is, country) of nationality: ES 
State (that is, country) of residence: ES 
This person is applicant for the purposes of: [X] the United States of America only 

Name and address: Reyes Benitez, Fernando 
Poligono Industrial La Mina 
Avda. de los Reyes, 1 
Colmenar Viejo 
Madrid, 28770, Spain 
This person is: [X] applicant and inventor 
State (that is, country) of nationality: ES 
State (that is, country) of residence: ES 
This person is applicant for the purposes of: [X] the United States of America only 

Name and address: Henriquez Pelaez, Ruben 
Poligono Industrial La Mina 
Avda. de los Reyes, 1 
Colmenar Viejo 
Madrid, 28770, Spain 
This person is: [X] applicant and inventor 
State (that is, country) of nationality: ES 
State (that is, country) of residence: ES 
This person is applicant for the purposes of: [X] the United States of America only 

Name and address: Ruffles, Graham Keith 
66-68 Hills Road 
Cambridgeshire CB2 1LA 
United Kingdom 
This person is: [X] applicant only 
State (that is, country) of nationality: ES 
State (that is, country) of residence: ES 
This person is applicant for the purposes of: [X] the States indicated in the Supplemental Box 

[ ] Further applicants and/or (further) inventors are indicated on another continuation sheet. 

Box No. V  DESIGNATION OF STATES 
Mark the applicable check-boxes below; at least one must be marked. 

The following designations are hereby made under Rule 4.9(a): 

Regional Patent 

[X] AP ARIPO Patent: GH Ghana, GM Gambia, KE Kenya, LS Lesotho, MW Malawi, MZ Mozambique, SD Sudan, 
SL Sierra Leone, SZ Swaziland, TZ United Republic of Tanzania, UG Uganda, ZM Zambia, ZW Zimbabwe, and any other 
State which is a Contracting State of the Harare Protocol and of the PCT (if other kind of protection or treatment desired, 
specify on dotted line) 

[X] EA Eurasian Patent: AM Armenia, AZ Azerbaijan, BY Belarus, KG Kyrgyzstan, KZ Kazakhstan, MD Republic of Moldova, 
RU Russian Federation, TJ Tajikistan, TM Turkmenistan, and any other State which is a Contracting State of the Eurasian 
Patent Convention and of the PCT 

[X] EP European Patent: AT Austria, BE Belgium, BG Bulgaria, CH & LI Switzerland and Liechtenstein, CY Cyprus, CZ Czech 
Republic, DE Germany, DK Denmark, EE Estonia, ES Spain, FI Finland, FR France, GB United Kingdom, GR Greece, 
HU Hungary, IE Ireland, IT Italy, LU Luxembourg, MC Monaco, NL Netherlands, PT Portugal, RO Romania, SE Sweden, 
SI Slovenia, SK Slovakia, TR Turkey, and any other State which is a Contracting State of the European Patent Convention 
and of the PCT 

[X] OA OAPI Patent: BF Burkina Faso, BJ Benin, CF Central African Republic, CG Congo, CI Cote d'Ivoire, CM Cameroon, 
GA Gabon, GN Guinea, GQ Equatorial Guinea, GW Guinea-Bissau, ML Mali, MR Mauritania, NE Niger, SN Senegal, 
TD Chad, TG Togo, and any other State which is a member State of OAPI and a Contracting State of the PCT (if other kind 
of protection or treatment desired, specify on dotted line) 

National Patent (if other kind of protection or treatment desired, specify on dotted line): 

[X] AE United Arab Emirates; [X] AG Antigua and Barbuda; [X] AL Albania; [X] AM Armenia; [X] AT Austria; 
[X] AU Australia; [X] AZ Azerbaijan; [X] BA Bosnia and Herzegovina; [X] BB Barbados; [X] BG Bulgaria; 
[X] BR Brazil; [X] BY Belarus; [X] BZ Belize; [X] CA Canada; [X] CH & LI Switzerland and Liechtenstein; 
[X] CN China; [X] CO Colombia; [X] CR Costa Rica; [X] CU Cuba; [X] CZ Czech Republic; [X] DE Germany; 
[X] DK Denmark; [X] DM Dominica; [X] DZ Algeria; [X] EC Ecuador; [X] EE Estonia; [X] ES Spain; [X] FI Finland; 
[X] GB United Kingdom; [X] GD Grenada; [X] GE Georgia; [X] GH Ghana; [X] GM Gambia; [X] HR Croatia; 
[X] HU Hungary; [X] ID Indonesia; [X] IL Israel; [X] IN India; [X] IS Iceland; [X] JP Japan; [X] KE Kenya; 
[X] KG Kyrgyzstan; [X] KP Democratic People's Republic of Korea; [X] KR Republic of Korea; [X] KZ Kazakhstan; 
[X] LC Saint Lucia; [X] LK Sri Lanka; [X] LR Liberia; [X] LS Lesotho; [X] LT Lithuania; [X] LU Luxembourg; 
[X] LV Latvia; [X] MA Morocco; [X] MD Republic of Moldova; [X] MG Madagascar; 
[X] MK The former Yugoslav Republic of Macedonia; [X] MN Mongolia; [X] MW Malawi; [X] MX Mexico; 
[X] MZ Mozambique; [X] NI Nicaragua; [X] NO Norway; [X] NZ New Zealand; [X] OM Oman; [X] PG Papua New Guinea; 
[X] PH Philippines; [X] PL Poland; [X] PT Portugal; [X] RO Romania; [X] RU Russian Federation; [X] SC Seychelles; 
[X] SD Sudan; [X] SE Sweden; [X] SG Singapore; [X] SK Slovakia; [X] SL Sierra Leone; [X] SY Syrian Arab Republic; 
[X] TJ Tajikistan; [X] TM Turkmenistan; [X] TN Tunisia; [X] TR Turkey; [X] TT Trinidad and Tobago; 
[X] TZ United Republic of Tanzania; [X] UA Ukraine; [X] UG Uganda; [X] US United States of America; 
[X] UZ Uzbekistan; [X] VC Saint Vincent and the Grenadines; [X] VN Viet Nam; [X] YU Serbia and Montenegro; 
[X] ZA South Africa; [X] ZM Zambia; [X] ZW Zimbabwe 

Check-boxes below reserved for designating States which have become party to the PCT after issuance of this sheet: 
[X] BW Botswana; [X] EG Egypt 


Precautionary Designation Statement: In addition to the designations made above, the applicant also makes under Rule 4.9(b) all 
other designations which would be permitted under the PCT except any designation(s) indicated in the Supplemental Box as being 
excluded from the scope of this statement. The applicant declares that those additional designations are subject to confirmation and that 
any designation which is not confirmed before the expiration of 15 months from the priority date is to be regarded as withdrawn by the 
applicant at the expiration of that time limit. (Confirmation (including fees) must reach the receiving Office within the 15-month time limit.) 

Supplemental Box 

Continuation of Box II 

Ruffles, Graham Keith is co-applicant for SD (Sudan) only 

Box No. VI  PRIORITY CLAIM 

The priority of the following earlier application(s) is hereby claimed: 

item (1) 
Filing date of earlier application (day/month/year): 20 December 2002 (20/12/02) 
Number of earlier application: 0229793.5 
Where earlier application is a national application, country or Member of WTO: GB 

items (2) to (5): (blank) 

[ ] Further priority claims are indicated in the Supplemental Box. 

The receiving Office is requested to prepare and transmit to the International Bureau a certified copy of the earlier application(s) (only 
if the earlier application was filed with the Office which for the purposes of this international application is the receiving Office) identified 
above as: [X] item (1) 


Box No. VII  INTERNATIONAL SEARCHING AUTHORITY 

Choice of International Searching Authority (ISA) (if two or more International Searching Authorities are competent to carry out the 
international search, indicate the Authority chosen; the two-letter code may be used): 

ISA/ 

Request to use results of earlier search; reference to that search (if an earlier search has been carried out by or requested from the 
International Searching Authority): 

Date (day/month/year)    Number    Country (or regional Office) 


Box No. VIII  DECLARATIONS 

The following declarations are contained in Boxes Nos. VIII (i) to (v) (mark the applicable 
check-boxes below and indicate in the right column the number of each type of declaration): 

[ ] Box No. VIII (i): Declaration as to the identity of the inventor 
[ ] Box No. VIII (ii): Declaration as to the applicant's entitlement, as at the international filing 
date, to apply for and be granted a patent 
[ ] Box No. VIII (iii): Declaration as to the applicant's entitlement, as at the international filing 
date, to claim the priority of the earlier application 
[ ] Box No. VIII (iv): Declaration of inventorship (only for the purposes of the designation of the 
United States of America) 
[ ] Box No. VIII (v): Declaration as to non-prejudicial disclosures or exceptions to lack of novelty 

Number of declarations: 

Box No. IX  CHECK LIST; LANGUAGE OF FILING 

This international application contains: 

(a) the following number of sheets in paper form: 
request (including declaration sheets): 
description (excluding sequence listing part): 107 
claims: 7 
abstract: 1 
drawings: 9 
Sub-total number of sheets: 131 
sequence listing part of description (actual number of sheets if filed in paper form, whether or not also 
filed in computer readable form; see (b) below): 
Total number of sheets: 131 

(b) sequence listing part of description filed in computer readable form 
(i) [ ] only (under Section 801(a)(i)) 
(ii) [ ] in addition to being filed in paper form (under Section 801(a)(ii)) 
Type and number of carriers (diskette, CD-ROM, CD-R or other) on which the sequence listing part is 
contained (additional copies to be indicated under item 9(ii), in right column): 

This international application is accompanied by the following item(s) (mark the applicable check-boxes 
below and indicate in right column the number of each item): 

1. [X] fee calculation sheet 
2. original separate power of attorney 
3. [ ] original general power of attorney 
4. [ ] copy of general power of attorney; reference number, if any: 
5. [ ] statement explaining lack of signature 
6. [ ] priority document(s) identified in Box No. VI as item(s): 
7. translation of international application into (language): 
8. separate indications concerning deposited microorganism or other biological material 
9. [ ] sequence listing in computer readable form (indicate also type and number of carriers (diskette, 
CD-ROM, CD-R or other)) 
(i) [ ] copy submitted for the purposes of international search under Rule 13ter only (and not as part 
of the international application) 
(ii) [ ] (only where check-box (b)(i) or (b)(ii) is marked in left column) additional copies including, 
where applicable, the copy for the purposes of international search under Rule 13ter 
(iii) [ ] together with relevant statement as to the identity of the copy or copies with the sequence 
listing part mentioned in left column 
10. [X] other (specify): Form 23/77 

Figure of the drawings which should accompany the abstract: 

Language of filing of the international application: English 


Box No. X  SIGNATURE OF APPLICANT, AGENT OR COMMON REPRESENTATIVE 

Next to each signature, indicate the name of the person signing and the capacity in which the person signs (if such capacity is not obvious from reading the request). 

For receiving Office use only 

1. Date of actual receipt of the purported international application: 
2. Drawings: [ ] received  [ ] not received 
3. Corrected date of actual receipt due to later but timely received papers or drawings completing 
the purported international application: 
4. Date of timely receipt of the required corrections under PCT Article 11(2): 
5. International Searching Authority (if two or more are competent): ISA/ 
[ ] Transmittal of search copy delayed until search fee is paid 

For International Bureau use only 

Date of receipt of the record copy by the International Bureau: 

This sheet is not part of and does not count as a sheet of the international application. 

For receiving Office use only 

FEE CALCULATION SHEET 
Annex to the Request 

Applicant's or agent's file reference: WPP287203 
International Application No.: 
Date stamp of the receiving Office: 
Applicant: Pharma Mar, S.A. et al 

CALCULATION OF PRESCRIBED FEES 

1. TRANSMITTAL FEE: 55.00 [T] 

2. SEARCH FEE: 640.00 [S] 
International search to be carried out by: 
(If two or more International Searching Authorities are competent to carry out the international 
search, indicate the name of the Authority which is chosen to carry out the international search.) 

3. INTERNATIONAL FEE 

Basic Fee 
Where item (b) of Box No. IX applies, enter Sub-total number of sheets; 
where item (b) of Box No. IX does not apply, enter Total number of sheets: 131 
b1: first 30 sheets: 278 
b2: 101 (number of sheets in excess of 30) x 6 (fee per sheet) = 606 
b3: additional component (only if sequence listing part of description 
is filed in computer readable form under Section 801(a)(i), or 
both in that form and on paper, under Section 801(a)(ii)): 400 x fee per sheet = 
Add amounts entered at b1, b2 and b3 and enter total at B: 884 [B] 

Designation Fees 
The international application contains all designations. 
number of designation fees payable (maximum 5) x amount of designation fee (60) = 300 [D] 

Add amounts entered at B and D and enter total at I: 1184 [I] 
(Applicants from certain States are entitled to a reduction of 75% of the 
international fee. Where the applicant is (or all applicants are) so entitled, the total 
to be entered at I is 25% of the sum of the amounts entered at B and D.) 

4. FEE FOR PRIORITY DOCUMENT (if applicable): 22 [P] 

TOTAL FEES PAYABLE 
Add amounts entered at T, S, I and P, and enter total in the TOTAL box: 1901 

[ ] The designation fees are not paid at this time. 

MODE OF PAYMENT 
[X] authorization to charge deposit account (see below) 
[ ] cheque 
[ ] postal money order 
[ ] bank draft 
[ ] cash 
[ ] revenue stamps 
[ ] coupons 
[ ] other (specify): 

AUTHORIZATION TO CHARGE (OR CREDIT) DEPOSIT ACCOUNT 
(This mode of payment may not be available at all receiving Offices) 

[X] Authorization to charge the total fees indicated above. 
[ ] (This check-box may be marked only if the conditions for deposit accounts 
of the receiving Office so permit) Authorization to charge any deficiency 
or credit any overpayment in the total fees indicated above. 
[ ] Authorization to charge the fee for priority document. 

Receiving Office: RO/GB 
Deposit Account No.: D10176 
Date: 18 December 2003 
Name: L. Gannon 


Signature: 
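
The arithmetic on the fee calculation sheet can be checked with a short Python sketch. The constants are the amounts entered on the sheet; the function and variable names are illustrative only. The count of five designation fees is an assumption, inferred from the stated maximum of 5 and from the entered total of 300 at a fee of 60. The unused b3 component is left out.

```python
# Amounts as entered on the fee calculation sheet (no currency is stated there).
TRANSMITTAL_FEE = 55        # box T
SEARCH_FEE = 640            # box S
PRIORITY_DOCUMENT_FEE = 22  # box P

BASIC_FEE_FIRST_30_SHEETS = 278  # b1
FEE_PER_EXCESS_SHEET = 6         # multiplier used for b2
DESIGNATION_FEE = 60
DESIGNATION_FEES_PAYABLE = 5     # assumption: the sheet's maximum of 5 applies


def international_fee(total_sheets: int) -> int:
    """Return box I: basic fee (B) plus designation fees (D)."""
    excess_sheets = max(total_sheets - 30, 0)
    b2 = excess_sheets * FEE_PER_EXCESS_SHEET
    basic_fee = BASIC_FEE_FIRST_30_SHEETS + b2                      # box B
    designation_fees = DESIGNATION_FEES_PAYABLE * DESIGNATION_FEE   # box D
    return basic_fee + designation_fees


def total_fees(total_sheets: int) -> int:
    """Return the TOTAL box: T + S + I + P."""
    return (TRANSMITTAL_FEE + SEARCH_FEE
            + international_fee(total_sheets) + PRIORITY_DOCUMENT_FEE)


if __name__ == "__main__":
    sheets = 131  # total number of sheets from Box No. IX
    print("I =", international_fee(sheets))
    print("TOTAL =", total_fees(sheets))
```

With 131 sheets, 101 sheets fall in excess of 30, so b2 is 606, B is 884, and adding D of 300 gives the 1184 entered at I; adding T, S and P gives the 1901 entered in the TOTAL box.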

PATENT COOPERATION TREATY 

From the INTERNATIONAL BUREAU 

NOTICE INFORMING THE APPLICANT OF THE 
COMMUNICATION OF THE INTERNATIONAL 
APPLICATION TO THE DESIGNATED OFFICES 

(PCT Rule 47.1(c), first sentence) 

Date of mailing (day/month/year): 08 July 2004 (08.07.2004) 

To: 
RUFFLES, Graham, Keith 
Marks & Clerk 
66-68 Hills Road 
Cambridgeshire CB2 1LA 
UNITED KINGDOM 

Applicant's or agent's file reference: WPP287203 

IMPORTANT NOTICE 

International application No.: PCT/GB2003/005563 
International filing date (day/month/year): 19 December 2003 (19.12.2003) 
Priority date (day/month/year): 20 December 2002 (20.12.2002) 
Applicant: PHARMA MAR, S.A. et al 

1. Notice is hereby given that the International Bureau has communicated, as provided in Article 20, the international application to the 
following designated Offices on the date indicated above as the date of mailing of this notice: 

AU, AZ, BY, CH, CN, CO, DZ, EP, HU, JP, KG, KP, KR, MD, MK, MZ, RU, TM, US 

In accordance with Rule 47.1(c), third sentence, those Offices will accept the present notice as conclusive evidence that the communication 
of the international application has duly taken place on the date of mailing indicated above and no copy of the international application is 
required to be furnished by the applicant to the designated Office(s). 

2. The following designated Offices have waived the requirement for such a communication at this time: 

AE, AG, AL, AM, AP, AT, BA, BB, BG, BR, BW, BZ, CA, CR, CU, CZ, DE, DK, DM, EA, EC, EE, EG, ES, FI, GB, GD, 
GE, GH, GM, HR, ID, IL, IN, IS, KE, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MG, MN, MW, MX, NI, NO, NZ, OA, OM, PG, 
PH, PL, PT, RO, SC, SD, SE, SG, SK, SL, SY, TJ, TN, TR, TT, TZ, UA, UG, UZ, VC, VN, YU, ZA, ZM, ZW 